For this sprint I picked up #10517 which is part of #10489.
I think of this project as allowing type=software for a dataset (and other types such as type=workflow, etc.)
I will, of course, be looking at Proposal: Supporting Multiple Dataset Types in Dataverse but I'm happy to get the latest thoughts from people who are interested.
@Oliver Bertuch this is important for HERMES, of course. I'm happy to hear what you think.
I made a couple commits:
As of those commits you can create a dataset of type "software". Here's a test: https://github.com/pdurbin/dataverse/blob/b3c2dce4c4ade96dabadadf5139d3868ac6b8854/src/test/java/edu/harvard/iq/dataverse/api/DatasetTypesIT.java
I had a quick glimpse at your types code today. Didn't have the time to dig deep, but I have some ideas. This is not a PR yet, where should I leave comments and thoughts?
Should this be a draft PR?
I mean, I can make a draft PR if that would help you capture some thoughts.
Here you go, a draft PR: dataset types (software, workflow, etc.) - initial support #10694
(From a new branch I pushed to IQSS instead of pdurbin, by the way.)
Thanks all, for listening to me talk about and demo dataset types so far at tech hours.
As discussed, I stopped hard-coding the dataset types in an enum and moved them into the database. Please see https://github.com/IQSS/dataverse/pull/10694/commits/c8adf259ec9684c7db3d6a2a324973e0407cdfd5
Ok as of https://github.com/IQSS/dataverse/pull/10694/commits/8593d328ea5fbb94c1435b0ff4651199dcf7abef we are sending "Dataset", "Software", or "Workflow" to DataCite (using this XML: <resourceType resourceTypeGeneral="${resourceTypeGeneral}"/>).
Here's a software example:
Screenshot-2024-07-29-at-5.16.31PM.png
(Notice that it says "Software" next to the name, pyDataverse.)
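A minimal sketch of the type-to-DataCite mapping described above, for anyone following along. It is written in Python for brevity and is not the Dataverse implementation (which is Java). The dictionary, the function name `resource_type_element`, and the fallback to "Dataset" for unknown types are all assumptions made for illustration; only the three values and the XML shape come from the message above.

```python
from xml.sax.saxutils import quoteattr

# Hypothetical lookup table: dataset type name -> DataCite resourceTypeGeneral.
# The three values are the ones mentioned in the thread.
RESOURCE_TYPE_GENERAL = {
    "dataset": "Dataset",
    "software": "Software",
    "workflow": "Workflow",
}


def resource_type_element(dataset_type: str) -> str:
    """Render the <resourceType> element for a given dataset type.

    Falling back to "Dataset" for unknown types is an assumption of this
    sketch, not confirmed behavior.
    """
    general = RESOURCE_TYPE_GENERAL.get(dataset_type.lower(), "Dataset")
    # quoteattr escapes the value and wraps it in quotes for use as an attribute.
    return f"<resourceType resourceTypeGeneral={quoteattr(general)}/>"


if __name__ == "__main__":
    print(resource_type_element("software"))
```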
#10694 has been merged! :tada:
The next issue is this one: Implement datasetType metadata block support (at global level) #10519
"Designate specific metadata fields for a specific datasetType"
@Oliver Bertuch is this something you want?
I'm having trouble imagining what the API test would look like. :thinking:
I'm hesitant to pick it up because I'm not sure what it entails.
Also, it seems like it will make a lot of changes to the old JSF UI, which we will be throwing away some day. :trash_can:
At standup I argued that #10519 seems to involve a lot of JSF work and should be deferred. We're planning on talking about it more during tech hours tomorrow.
I made this pull request a little over a month ago. I'm happy to get some (more) feedback on it:
allow links between dataset types and metadata blocks #11001
Merged! allow links between dataset types and metadata blocks #11001 :tada:
@Oliver Bertuch @Dorothea Iglezakis @Jan Range are you still interested in this feature, in general, the idea of designating a dataset as a "software" dataset? With different metadata blocks (e.g. CodeMeta) and different licenses (Apache, etc.)? Sending "software" as the type to DataCite?
I would definitely +1 this :raised_hands:
Ok, great. I've worked on the last two PRs (both merged now) and there's another to work on but I'm feeling a bit disconnected from actual users. I hope we're building the right thing!
This is the next thing to work on: Implement list of allowed datasetType licenses (at global level) #10520
Happy to test once #10520 is merged :smile:
The thing is that these features are API-only. We don't plan to modify the old JSF UI code. We'll be making use of the features in the new React UI but only after the new React UI gets to a certain level of readiness.
But for this feature about licenses, for example, how should it work? If you set datasetType=software, what would you like to happen with regard to licenses?
I am fine with an API-first solution. Ideally, the transfer of software to a dataset happens via some GitHub/Lab Action.
In terms of licenses, it would be great if it would switch to the subset of software licenses. I will ping Doro as she is more into the license topic and probably has some better suggestions/wishes.
Hmm, a subset of software licenses... not all software licenses? :big_smile:
Anyway, yes, please ping Doro!
Philip Durbin wrote:
Hmm, a subset of software licenses... not all software licenses? :big_smile:
Ah, fingers faster than the brain :grinning: Meant the subset of licenses meant for software :woozy_face:
Ok, makes sense!
Yes, this sounds perfectly fine. In practice, we have - at the moment - three different cases: datasets that consist only of data, datasets that consist only of software code, and datasets that consist partly of code, partly of data. For the first two cases, licensing is fairly straightforward, and it would be perfect if only data licenses (CC licenses, ODbL, ...) were available for data datasets and only software licenses (MIT, GPL, BSD, Apache, ...) for software datasets. It gets trickier for the mixed datasets. Sometimes the dataset is mainly code with a bit of data or mainly data with a bit of code, but in most cases we handle the license of a mixed dataset with a custom license, as in darus-1854 or darus-2134. Another interesting case is software code that consists of other software parts licensed under a different license. The problem with these custom terms is that they are not machine-readable.
For these cases, it would be perfect, if more than one license could be chosen and some specification of the part of the dataset would be possible (like everything in folder xyz is licensed under license A, the rest licensed under license B). I know that there is also discussion about licensing on the file level, but this could get complicated if there are a lot of files in a dataset.
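To make the two ideas above concrete, here is a rough sketch in Python: restricting the license list by dataset type, and assigning a license per folder with a default for the rest. Nothing here reflects an actual Dataverse API. The table `ALLOWED_LICENSES`, the license names in it, and the functions `licenses_for` and `license_for_file` are hypothetical, and returning every license for a type with no entry is just one possible choice.

```python
from pathlib import PurePosixPath

# Hypothetical configuration: licenses allowed for each dataset type.
ALLOWED_LICENSES = {
    "dataset": {"CC0 1.0", "CC BY 4.0", "ODbL"},
    "software": {"MIT", "GPL", "BSD", "Apache"},
}


def licenses_for(dataset_type, all_licenses):
    """Return the licenses offered for a dataset type, keeping input order.

    Types without an entry get every license (an assumption of this sketch).
    """
    allowed = ALLOWED_LICENSES.get(dataset_type)
    if allowed is None:
        return list(all_licenses)
    return [name for name in all_licenses if name in allowed]


def license_for_file(path, folder_licenses, default_license):
    """Pick the license of the deepest matching folder, else the default."""
    parts = PurePosixPath(path).parts
    best_depth, best_license = 0, default_license
    for folder, license_name in folder_licenses.items():
        folder_parts = PurePosixPath(folder).parts
        # Compare whole path components so "xyz" does not match "xyzzy".
        if parts[:len(folder_parts)] == folder_parts and len(folder_parts) > best_depth:
            best_depth, best_license = len(folder_parts), license_name
    return best_license


if __name__ == "__main__":
    everything = ["CC0 1.0", "CC BY 4.0", "MIT", "Apache"]
    print(licenses_for("software", everything))
    print(license_for_file("xyz/src/main.py", {"xyz": "MIT"}, "CC BY 4.0"))
```

A folder-to-license mapping like this stays machine-readable, unlike free-text custom terms, and avoids per-file bookkeeping when a dataset has many files.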
@Dorothea Iglezakis thanks! I'm glad two of the three use cases are straightforward! :sweat_smile:
I just linked to your post in a comment in the design doc. Thanks again. :heart:
In another topic I'm musing about what should go in the <head> for non-datasets. :thinking:
@Oliver Bertuch I mentioned #11589 in standup this morning. Please feel free to add a 6.7 milestone if you really want it for that release.
@Philip Durbin done! I added a test and some doc tweaks as requested.
Merged! Thanks!
Last updated: Nov 01 2025 at 14:11 UTC