There's a new repo at https://github.com/gdcc/dataverse-recipes for sharing "recipes" with each other.
"A collection of code recipes and examples for interacting with Dataverse using different programming languages and tools. This repository serves as a practical resource for developers who need to integrate with Dataverse in their applications."
I talked about the new repo in the community call today and it was very well received! See the tail end of the recording and notes.
One question I remember is whether SQL scripts are allowed. I don't see why not!
Yes, I love Julian's doc of SQL queries:
https://docs.google.com/document/d/1-Y_iUduSxdDNeK1yiGUxe7t-Md7Fy965jp4o4m1XEoE/edit?tab=t.0#heading=h.avuoo5kf0mdt
Yes, great stuff from @Julian Gautier!
I just pushed https://github.com/gdcc/dataverse-recipes/pull/6 about using pyDataverse to create Croissant files. It's still in draft. Not ready to be merged.
A new S3 direct upload shell script! https://github.com/gdcc/dataverse-recipes/pull/8
Should each script have a dedicated directory? :thinking: That's what I'm asking at https://github.com/gdcc/dataverse-recipes/pull/9
While looking at https://github.com/gdcc/dataverse-recipes/issues/10 by @Don Richards (upgrade scripts! :tada:) I went back and looked at the conversation @Jan Range and I had about how to organize the recipes repo. :sweat_smile:
@Jan Range also, should we standardize on dashes or underscores in the names of scripts and directories?
Philip Durbin wrote:
Should each script have a dedicated directory? :thinking: That's what I'm asking at https://github.com/gdcc/dataverse-recipes/pull/9
I think it is probably cleaner when the recipes grow in number. Also in terms of dependency management, this seems more sustainable.
Regarding shell scripts, I think it is okay not to use dedicated directories. I guess these will stay single scripts, and adding a dir does not add any value imo
well...
Did you see https://github.com/gdcc/dataverse-recipes/issues/10 ?
Sorry, I was looking at the main branch :woozy_face: In this case, directories make total sense! Don's proposed structure looks great.
Philip Durbin wrote:
Jan Range also, should we standardize on dashes or underscores in the names of scripts and directories?
Hm that's a tough one. I personally prefer underscores for files, but do you think this is important to standardize? It makes sense for directories though - Looks cleaner.
Sorry, which do you think looks cleaner?
Dashes for directories
Since we are already using dashes, I think it's good to keep it consistent
Are you saying you want dashes for directories but underscores for files? :upside_down:
We've been using dashes already, so staying with them for the directories makes sense. In terms of files, I prefer underscores, but I mean, we can change everything to underscores or vice versa.
What are your thoughts?
If we're deciding on rules, we should write them down. :big_smile:
I have the impression that the Python world likes underscores.
That makes sense :grinning:
snake_case :snake:
Indeed, I think that's my bias :grinning_face_with_smiling_eyes:
I like kebab-case for URLs and such. And I think I carry this over to everything.
"Consider using hyphens to separate words in your URLs, as it helps users and search engines identify concepts in the URL more easily. We recommend that you use hyphens (-) instead of underscores (_) in your URLs." -- https://developers.google.com/search/docs/crawling-indexing/url-structure
Is it safe to say that Pythonistas also use underscores in directory names?
Yes, it is pretty standard among Pythonistas. At least, I have not yet seen one using dashes for filenames and directories.
I'd suggest we settle on underscores for file names because the languages we list mainly use underscores. For the directories, I am fine with both, but since these are not modules/namespaces in the coding sense, dashes are fine and look nicer. What do you think?
If we're going with underscores for file names (which is fine with me) I think we should just use underscores everywhere, including directory names. That way the rule is simple.
To quote the Zen of Python, "There should be one-- and preferably only one --obvious way to do it." :big_smile:
Okay, that makes sense (and it's easier)
Setting up a PR :smile:
perfect! thanks!
Should we make exceptions for languages that are strictly non-snake-case? This is the new section in the Readme:
File names: snake_case (e.g. create_dataset.py)
Directory names: snake_case (e.g. python_create_dataset)
Note: If a language convention requires it, use camel/kebab/pascal case, but make sure to align with the existing naming conventions. Examples are:
JavaScript (e.g. createDataset.js or create-dataset.js)
Java (e.g. CreateDataset.java)
Sounds like a great start!
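To make the underscore rule concrete, here is a minimal Python sketch of how names could be checked against it. The function names (`is_snake_case`, `non_conforming`) and the example paths are hypothetical and not part of the repo. The sketch deliberately ignores the per-language exceptions from the note above, and it would also flag files like README.md.

```python
import re
from pathlib import PurePosixPath

# Lowercase letters and digits, with words separated by single underscores.
SNAKE_CASE = re.compile(r"^[a-z0-9]+(_[a-z0-9]+)*$")


def is_snake_case(name: str) -> bool:
    """Return True if a file or directory name (extension ignored) is snake_case."""
    stem = PurePosixPath(name).stem
    return bool(SNAKE_CASE.match(stem))


def non_conforming(paths):
    """Return the paths that contain at least one component that is not snake_case."""
    bad = []
    for path in paths:
        parts = PurePosixPath(path).parts
        if not all(is_snake_case(part) for part in parts):
            bad.append(path)
    return bad
```

Something like `non_conforming(["shell/fetch-prod/fetch_prod.sh"])` would report that path because of the dash in the directory name.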
Do you prefer the blue note or as it is currently?
https://github.com/gdcc/dataverse-recipes/pull/12 looks fine as-is but please note that I added a commit.
Good catch! Thanks
Sure. I went ahead and approved it.
Merged!
Today in the containerization meeting we talked about how, in some cases, we might want to mention scripts as recipes without actually moving them from their repo. For example, https://jugit.fz-juelich.de/fdm/k8s/k8s-storage-benchmark by @Oliver Bertuch
I'm cooking up some Python to query the hub about installations: https://github.com/gdcc/dataverse-recipes/pull/15
Feedback on my Python is welcome, by the way. :big_smile:
I also added a script to get metrics from the hub for a particular Dataverse installation.
(See also #python > hub.)
A load of new shell scripts from @Don Richards merged! https://github.com/gdcc/dataverse-recipes/pull/14 :tada:
New download script! https://github.com/gdcc/dataverse-recipes/pull/17
A new script for downloading Croissant :croissant: from draft datasets: https://github.com/gdcc/dataverse-recipes/pull/19
Feedback welcome! It hasn't been merged yet.
New dvcli recipes available using the Rust Dataverse CLI :raised_hands: Feedback is very welcome :heart_hands:
https://github.com/gdcc/dataverse-recipes/pull/20
Wow, extensive PR! Great!
@Philip Durbin you asked yesterday if I would like to put the script for applying database options in idempotent ways in the recipes repo. That makes me wonder: should we perhaps move stuff from the upstream repo into recipes?
Provocative question: why not make the classic installer at home in gdcc/dataverse-recipes, removing it from the main tree?
Also, if we can agree on a certain standard, for example what we want shell scripts to look like (names of parameters, structure, etc.), we could ship many of them with configbaker...
(Probably the image generation for configbaker needs to stay in upstream due to the Solr dependency etc, but who knows.)
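For readers wondering what "applying database options in idempotent ways" could mean in practice, here is a rough Python sketch of the general idea: compare the current state with the desired state and only write what differs. This is not the script mentioned above. The function names and the setting names are hypothetical, and plain dicts stand in for reading from and writing to a real installation.

```python
def plan_setting_changes(current: dict, desired: dict) -> list:
    """Return (name, value) pairs that must be written to reach the desired state.

    Settings that already hold the desired value are skipped, so planning
    again after applying the changes yields an empty list.
    """
    return [
        (name, value)
        for name, value in sorted(desired.items())
        if current.get(name) != value
    ]


def apply_changes(current: dict, changes: list) -> dict:
    """Return a new dict with the changes applied (stand-in for the real write)."""
    updated = dict(current)
    updated.update(changes)
    return updated


# Hypothetical setting names, for illustration only.
current = {":ExampleSettingA": "on"}
desired = {":ExampleSettingA": "on", ":ExampleSettingB": "42"}

changes = plan_setting_changes(current, desired)
after = apply_changes(current, changes)
```

Running the plan a second time against `after` produces no changes, which is the property that makes the approach safe to repeat.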
Ha, it is provocative to think about moving the installer. We did move the ec2 spin up script from the main repo to https://github.com/gdcc/dataverse-ansible
And because of that, bumping the version over in ansible is part of our release process, like this: https://github.com/gdcc/dataverse-ansible/commit/bb903b82c7cc810eb8292a00d711d7fbd7bd2f83 (for Jenkins).
We can let renovate take care of that :wink:
"In the dataverse-ansible repo make bump the version in jenkins.yml and make a pull request such as https://github.com/gdcc/dataverse-ansible/pull/386. Wait for it to be merged. Note that bumping on the Jenkins side like this will mean that all pull requests will show failures in Jenkins until they are updated to the version we are releasing." -- https://guides.dataverse.org/en/6.7/developers/making-releases.html#prepare-release-branch
Jan Range said:
New dvcli recipes available using the Rust Dataverse CLI :raised_hands: Feedback is very welcome :heart_hands:
Merged!
Also merged! Add fetch_prod.sh Script for Production to Staging/Clone Synchronization https://github.com/gdcc/dataverse-recipes/pull/23 by @Don Richards :heart:
And also merged! Scripts to detect use of draft PIDs and other not found errors https://github.com/gdcc/dataverse-recipes/pull/26
Who's next? :smile:
First JavaScript recipe merged, thanks to @Jan Range! https://github.com/gdcc/dataverse-recipes/pull/28
The "about" for the repo said "A place for Python scripts and other goodies".
I just removed Python. :smile:
Don't get me wrong, I love Python. But these days we have more languages.
Also, I just showed the js script during the weekly frontend meeting.
Hehe I think the recipe idea has grown beyond its initial intent, which is awesome!
I never dreamed we'd have so many upgrade scripts.
We could probably find a way, by cloning the main tree first, to add my experiment with Flyway to recipes as well. Would that be of interest, so we keep it out of the main tree?
Sure! Would you put it under "shell" or create a new folder for "sql"?
Maybe under "Java"?
Dunno
You need a Maven project for this to work
Sure, "java" is fine. For serious work! :crazy:
Last updated: Nov 01 2025 at 14:11 UTC