Dataverse holds input data, code, and output artifacts. Users can happily read all three. But re-executing the code is both more difficult and more useful than just reading it. More useful because users can tweak the code or try new input data. More difficult because re-execution requires more than the source code alone: it also needs metadata such as the R version (more generally, a software environment specification) and the "workflow" (what code to run, in what order, with what parameters, etc.).
Should Dataverse store machine-readable metadata describing how to run the code?
One possible implementation, for example, might be to have a polymorphic "software environment" field (which could hold a requirements.txt, renv.lock, or Dockerfile) and a polymorphic "workflow" field (which could hold a script, a CWL workflow, or another kind of workflow that kicks off the rest of the code in the Dataset, to use Dataverse's term). Other implementations are possible. For this discussion, I want to ask whether _any_ such implementation is considered "in scope" for the Dataverse project by the community (especially devs!), or whether it is considered better handled by other tools.
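To make the idea concrete, here is a minimal sketch of what such polymorphic metadata could look like and how a re-execution tool might dispatch on it. None of these field names are real Dataverse metadata, and the setup commands are illustrative only; everything here is a hypothetical assumption, not an existing API.

```python
# Hypothetical sketch: "softwareEnvironment" and "workflow" are invented field
# names, not part of any real Dataverse metadata block.

DATASET_METADATA = {
    "softwareEnvironment": {
        "type": "requirements.txt",   # could also be "renv.lock", "Dockerfile", ...
        "file": "requirements.txt",
    },
    "workflow": {
        "type": "script",             # could also be "CWL", "Snakemake", ...
        "entrypoint": "run_analysis.sh",
    },
}

# Shell commands a tool might derive from each environment type (illustrative).
ENV_SETUP_COMMANDS = {
    "requirements.txt": "pip install -r {file}",
    "renv.lock": 'Rscript -e "renv::restore()"',
    "Dockerfile": "docker build -t dataset-env -f {file} .",
}

def setup_command(metadata: dict) -> str:
    """Return the shell command that would recreate the software environment."""
    env = metadata["softwareEnvironment"]
    template = ENV_SETUP_COMMANDS[env["type"]]
    return template.format(file=env["file"])

print(setup_command(DATASET_METADATA))  # pip install -r requirements.txt
```

The point of the polymorphism is that one field slot covers many ecosystems: the consumer switches on `type` rather than Dataverse hard-coding any single packaging convention.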
Hi! We have an ongoing grant with the NIH (GREI) and as part of "aim 3" we definitely plan to poke around in this area. I don't think much has been written on this yet but there are a few notes at https://github.com/IQSS/dataverse-pm/issues/15
It's a big topic. Overall, I'm just trying to say "yes, and we'd love to hear your ideas"! :grinning:
CodeMeta has "build instructions": https://guides.dataverse.org/en/6.0/user/appendix.html
There's that; I also just learned about the Workflow Run Crate profile of RO-Crate.
We had a guy at Harvard Medical School create a very specific custom metadata block for x-ray crystallography that included reprocessing instructions.
Ah, you might also be interested in #dev > RO-Crate.
Ooh, do you have a link handy about the x-ray crystallography?
Last updated: Nov 01 2025 at 14:11 UTC