@Oliver Bertuch thanks for chatting about #11753 yesterday!
I suppose the next step is to figure out a good API to unset (set to null) the list of allowed dataset types for a collection. Any suggestions?
These are the docs I'm looking at but I don't see anything about unsetting an attribute:
This should be a DELETE REST endpoint, shouldn't it?
Hmm, so you're thinking a new API or at least a new verb. :thinking:
Also, should it be allowed to actually not allow any types?
I was trying to reuse existing functionality. :sweat_smile:
This doesn't make much sense from a business perspective, does it?
Well, if it's null, it only allows the "dataset" dataset type.
But that's mostly due to backward compatibility, isn't it?
yeah
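For anyone following along, here is a rough sketch of the null fallback described above. It is not Dataverse code: the class, method names, and the use of plain strings for type names are all hypothetical. The thread has not settled what an empty list should mean, so the sketch simply assumes an empty list allows nothing.

```java
import java.util.List;

// Hypothetical helper, not Dataverse code: illustrates the null fallback discussed above.
class AllowedTypesSketch {
    static final String DEFAULT_TYPE = "dataset";

    // A null list means nothing was configured, so only the default "dataset" type is allowed.
    // Assumption: an empty (non-null) list is taken literally and allows nothing.
    static List<String> effectiveAllowedTypes(List<String> configured) {
        return configured == null ? List.of(DEFAULT_TYPE) : List.copyOf(configured);
    }

    static boolean isAllowed(List<String> configured, String typeName) {
        return effectiveAllowedTypes(configured).contains(typeName);
    }

    public static void main(String[] args) {
        System.out.println(isAllowed(null, "dataset"));
        System.out.println(isAllowed(null, "software"));
        System.out.println(isAllowed(List.of("dataset", "software"), "software"));
    }
}
```

A migration that writes the default into every collection would make the null branch unnecessary, which is the trade-off being weighed here.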
I'm trying to take a light touch with this feature.
How about making this more explicit? Have a migration that will add the dataset type to all that have none configured.
That would rule out the "null" case
But now that I'm converting allowedDatasetTypes from a string to a ManyToMany list, things are getting more complicated. I need to be able to unset the list.
That's what we did at the dataset level when we introduced dataset types. All existing datasets were set to the "dataset" dataset type (using Flyway).
Here's an idea: what if for now we make the list non-optional and a migration adds the minimum default to all. At a future point in time, deleting the list and replacing it with null or empty would mean "inherit"
So we can add any other calls that are necessary to do that in a new iteration
Actually, that gives me an idea. I was trying to unset the list but maybe I can just explicitly set the list to a dataset type that I'm not trying to delete.
At the moment, I'm just trying to get DatasetTypesIT to pass along with the change from a String to a List for allowedDatasetTypes.
Yes! It works! :tada:
Yeah my suggestion about not allowing "null" or empty list for now was going in this direction of "set the list to a type that still makes sense"
What happens if you delete a type itself that has been used? It shouldn't be allowed to be dropped, right?
"message": "java.lang.IllegalStateException: Dataset type with id 6 is referenced and cannot be deleted."
but now I have the workaround above, which I'm documenting
Referenced by whom? The collection or a dataset?
IMHO that is an important distinction.
Well, this worked before so I'm assuming it's the new List relationship to the collection.
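To make the error and the workaround concrete, here is a small in-memory sketch. It only models references from collections (the assumption voiced above), and the registry class, its methods, and the "root" alias are hypothetical, not the Dataverse service layer. The message in the sketch names the referencing collection, which is one possible way to make the distinction visible.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model, not the Dataverse service layer.
class DatasetTypeRegistrySketch {
    // collection alias -> names of the dataset types it allows
    private final Map<String, List<String>> allowedByCollection = new HashMap<>();

    void setAllowedTypes(String collection, List<String> typeNames) {
        allowedByCollection.put(collection, new ArrayList<>(typeNames));
    }

    // Refuse to delete a type while any collection still lists it.
    void deleteType(String typeName) {
        for (Map.Entry<String, List<String>> entry : allowedByCollection.entrySet()) {
            if (entry.getValue().contains(typeName)) {
                throw new IllegalStateException("Dataset type " + typeName
                        + " is referenced by collection " + entry.getKey()
                        + " and cannot be deleted.");
            }
        }
        // Nothing references the type; a real implementation would remove it here.
    }

    public static void main(String[] args) {
        DatasetTypeRegistrySketch registry = new DatasetTypeRegistrySketch();
        registry.setAllowedTypes("root", List.of("instrument"));
        try {
            registry.deleteType("instrument");
        } catch (IllegalStateException ex) {
            System.out.println(ex.getMessage());
        }
        // Workaround from the thread: point the collection at a type that is staying.
        registry.setAllowedTypes("root", List.of("dataset"));
        registry.deleteType("instrument"); // no exception this time
    }
}
```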
Woo-hoo! Tests passing! I'll clean this up, commit, and push.
@Oliver Bertuch is this incrementally better? :sweat_smile: https://github.com/IQSS/dataverse/pull/11753/changes/d2f12dfd9eec3d782e68493b6f9293fecaf60b9e - switch allowedDatasetTypes in entity from String to a ManyToMany List
I need to look into our new storage option that (finally) came through today, so won't be able to look now.
No worries, I'll look through other stuff to fix that we talked about. Thanks again!
@Oliver Bertuch I just noticed you still have requested changes on https://github.com/IQSS/dataverse/pull/11753 . Is that right? Or should I click "dismiss review"?
Also, @Gustavo Durand picked up that PR to review it.
I have not looked at the remarks and the changes you made again. Maybe Gustavo can decide what he thinks about them.
We might have a zoom call if you'd like to join. Not sure when.
Sure. Today would be a good fit for that.
I made those changes for you :heart:
Let me take another look. Posters are printed. :smile:
Done.
Thanks!!
I did my best to reply to open questions and found a few other issues.
The most important one is certainly the missing equals() and hashCode() on DatasetType.
@Oliver Bertuch is just id and name enough? This is what VS Code is suggesting:
Screenshot 2026-03-02 at 3.42.27 PM.png
If I check just id and name, it generates this:
@Override
public int hashCode() {
    final int prime = 31;
    int result = 1;
    result = prime * result + ((id == null) ? 0 : id.hashCode());
    result = prime * result + ((name == null) ? 0 : name.hashCode());
    return result;
}

@Override
public boolean equals(Object obj) {
    if (this == obj)
        return true;
    if (obj == null)
        return false;
    if (getClass() != obj.getClass())
        return false;
    DatasetType other = (DatasetType) obj;
    if (id == null) {
        if (other.id != null)
            return false;
    } else if (!id.equals(other.id))
        return false;
    if (name == null) {
        if (other.name != null)
            return false;
    } else if (!name.equals(other.name))
        return false;
    return true;
}
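As a side note, the same id-and-name comparison can be written more compactly with java.util.Objects. This is only a sketch on a stand-in class (DatasetTypeSketch is hypothetical and carries just the two fields under discussion), not the code that was pushed.

```java
import java.util.Objects;

// Stand-in with only the two fields under discussion; not the real DatasetType entity.
class DatasetTypeSketch {
    private final Long id;
    private final String name;

    DatasetTypeSketch(Long id, String name) {
        this.id = id;
        this.name = name;
    }

    @Override
    public int hashCode() {
        // Null-safe hash over both fields.
        return Objects.hash(id, name);
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (obj == null || getClass() != obj.getClass()) return false;
        DatasetTypeSketch other = (DatasetTypeSketch) obj;
        // Objects.equals handles the null checks the generated code spells out by hand.
        return Objects.equals(id, other.id) && Objects.equals(name, other.name);
    }

    public static void main(String[] args) {
        DatasetTypeSketch a = new DatasetTypeSketch(1L, "dataset");
        DatasetTypeSketch b = new DatasetTypeSketch(1L, "dataset");
        System.out.println(a.equals(b) && a.hashCode() == b.hashCode());
    }
}
```

One general caveat for JPA entities: a generated id is typically null until the entity is persisted, so a hash that includes id can change after the object has already been placed in a HashSet or used as a map key.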
Anyway, pushed: https://github.com/IQSS/dataverse/pull/11753#discussion_r2874633026
And this: https://github.com/IQSS/dataverse/pull/11753/changes#r2874761915
Sorry, I was out with @Jan Range in Stuttgart. We had a fab dinner at a local Brauhaus!
Do you still want input?
For the other stuff, @Gustavo Durand @Juan Pablo Tosca Villanueva and I decided we'd punt. I'm going to create a tech debt issue for dataset types. Maybe you can help edit the description, @Oliver Bertuch, and we'll decide on the scope and size of the next push, whenever it is.
And I'm still open to adding small things to the PR we're working on now.
Sounds good to me! I'm just the messenger pointing at things, you decide the course, cap'n :man_rowing_boat:
Ha. I'll let you know when I've created it.
I literally wrote "create issue about corners that were cut" :sweat_smile:
But on the plus side, hopefully we'll get #11753 merged in time for 6.10.
@Oliver Bertuch I'm clicking "re-request review" on https://github.com/IQSS/dataverse/pull/11753
Screenshot 2026-03-03 at 2.22.59 PM.png
I don't want this to be a blocker for merging. (I know that worst case, we can always dismiss your review.)
@Oliver Bertuch hi, just checking in. Merging is blocked:
Screenshot 2026-03-04 at 7.41.25 AM.png
Just today I had a request from a researcher asking for Instruments support. Any objections to adding those? https://support.datacite.org/docs/can-i-register-a-doi-for-an-instrument
(Not as a part of this PR)
Any types can be added freely.
Just cook up a JSON file.
But we'd need to extend the enum for resourceTypeGeneral, wouldn't we?
Well, sure.
Like you did for reviews
If you want it to be sent to DataCite or show up in the CSL citation export.
Yeah, I'd like that. DataCite has native support for Instruments these days and it would be great to keep the data and instruments in the same place.
Sure. Future PR, though, please! :sweat_smile: Code freeze is tomorrow.
I think I said that above :wink:
So we're trying to get this one tested and merged.
Are you still thinking about the "merging is blocked" question above? :heart:
You sort of avoided talking about it. :smile:
Sorry at a conference.
I am just doing the review in parallel to a session
Worst case we'll dismiss your review so we can merge.
(As you may have noticed)
I just revisited my main concerns and find them all covered now. Approved. :check: Thanks for working on this and bearing with my lengthy remarks.
Sure! To be honest, I need to do a little more testing, so I'll be back on that branch when I get to work.
So if there's anything small you want, lemme know.
Philip Durbin said:
Sure. Future PR, though, please! :sweat_smile: Code freeze is tomorrow.
In Spain (and I've heard also in Latin America, and I think in France as well), we are receiving requests to upload Data Management Plans to our Dataverse instances.
Could Data Management Plan be considered a new dataSetType? Should I open a new PR?
@Juan interesting! I think we should have a different topic about DMPs. Do you want to create one? Maybe in #community? I'm sure @Viviane Veiga and many others will be interested. By the way, in #12000 we're adding this:
As shown in a video <https://www.researchspace.com/blog/dmptool-rspace-eln-dataverse-integration>, a Data Management Plan (DMP) can be added into RSpace and the research records and associated data can then be sent to Dataverse. Dataverse generates a Persistent Identifier (PID, often a DOI) for the dataset, and RSpace automatically puts the PID link under "Research Outputs" in the DMP.
(Thanks to @Tilo Mathes and @Vaida Plankytė and the rest of RSpace for that integration!)
Thanks @Philip Durbin, I will watch the video and open the new topic tomorrow. I have meetings for the rest of my workday.
@Juan @Philip Durbin DataCite has support for an "OutputManagementPlan" type these days.
I saw that. Interesting.
@Philip Durbin should we discuss in a new GH issue if the resource type general should be a configurable type and not hardcoded?
Or maybe a design doc? I started this one called "Resource Types" back in October: https://docs.google.com/document/d/1xUbOj0vQ629_YF0nMM3OUnfjY4QBw3WVsD_MnrMMBYI/edit?usp=sharing
Output management plans bring to mind that there are DMPs and SMPs, and maybe more in the future.
Sure.
(I don't find GitHub issues to be great for long discussions.)
It would be good to collect use cases and think some more about where we wanna go with this
Should we keep the existing design doc and add another chapter/part to it?
Well, remember we have an even earlier design doc: Proposal: Supporting Multiple Dataset Types in Dataverse - https://docs.google.com/document/d/16RvGXmaPQK9DGsEEbrrFEu8mjUrN93YY_yh7ZONyMDI/edit?usp=sharing
Oh so we already have multiple. We can always link between them for tracking the history
I'm fine with whatever. New doc. New Zulip topic. A meeting. Anything but trying to have a discussion in a GitHub issue. :smile:
5 messages were moved from this topic to #dev > Architecture Decision Records (ADRs) by Oliver Bertuch.
So sorry @Philip Durbin but I did find a small thing. :see_no_evil: Stumbled over this just this morning while looking for other things in the DataCite rTG vocabulary. https://github.com/IQSS/dataverse/pull/11753/changes#r2888545716
@Oliver Bertuch thanks, I just replied
Want me to add a comment in the code?
That may be good, as others may have the same question. Should this go into the docs as well?
Maybe. Where?
On the train ride home now. Let me take a look.
I left a few (three) suggestions on the user doc.
@Oliver Bertuch I don't suppose you have time for a call
The name is "dataset" and the displayName is "Dataset". Maybe I can edit your suggestion to make this clear.
In a public no-phone environment at the moment... Sorry. Will be back to desk later if that helps.
I was suggesting to use "Dataset" as this is the user documentation and users will see "Dataset" in the UI
Feel free to ignore any suggestions that don't resonate with you
I don't feel strongly about any of this.
Actually, since that suggestion isn't in the API Guide, I'll take it.
Actually, I took it but I'm going to change it back to lower case "dataset" because that's what is shown in the search facet (as of this PR).
Ok, I think I'm done making changes: https://github.com/IQSS/dataverse/pull/11753/commits
Juan said:
Thanks Philip Durbin, I will watch the video and open the new topic tomorrow. I have meetings for the rest of my workday.
@Juan thanks for kicking off #community > A new supported datasetType for Data Management Plans? !
@Juan since you're obviously using dataset types (or plan to), I wanted to give you (and others) a heads-up that in #11753 the following backward-incompatible change is coming:
In previous releases of Dataverse, as soon as additional dataset types were added (such as "software", "workflow", etc.), they could be used by all users when creating datasets (via API only). As of this release, on a per-collection basis, superusers must allow these dataset types to be used. See #12115 and #11753.
Thanks @Philip Durbin, we will migrate to v6.10 before starting to use dataset types.
Ok, phew :relieved:
Longer term, maybe dataset types should turn into resource types. I don't know.
It would be an architectural change. I gave you comment access on the Resource Types doc I mentioned above. But that doc needs a lot more work.
Perfect. Thanks
Have you played with Zenodo? They allow you to create a bunch of different resource types.
A little bit, but we like Dataverse much more :smile:. For now we need to add DMPs and software, and perhaps other resource types in the future.
Here's the Zenodo UI:
The OSF UI:
It seems that they do not have a specific resource type for DMPs. Even the DMPs imported from Argos have "Other" resource type: https://zenodo.org/records/3980771 . Argos and Zenodo are OpenAIRE tools.
They both seem to have the option for Output Management Plan (which, if you squint, might look like a DMP :smile: ):
Any objections if I move all these messages into the main topic (for dataset types)?
Update: moved! Sorry if it's messy! I moved the messages about potential DMP and instrument dataset types here.
Yes, you are absolutely right. I didn't find it under publication.
@Oliver Bertuch @Gustavo Durand I just posted more about nullable true vs false for displayName at https://github.com/IQSS/dataverse/pull/11753/changes#r2788164627
I might take another run at making displayName nullable=false. Wish me luck!
Wait, wait, it's already nullable=false:
@Column(nullable = false, columnDefinition = "VARCHAR(255) DEFAULT ''")
private String displayName;
@Oliver Bertuch is there anything to fix here? Or in the following?
-- Add displayname column to datasettype table
ALTER TABLE datasettype ADD COLUMN IF NOT EXISTS displayname VARCHAR(255) NOT NULL DEFAULT '';
-- Populate displayname with name but capitalize it (name=dataset becomes displayname=Dataset)
UPDATE datasettype SET displayname = CONCAT(UPPER(SUBSTRING(name, 1, 1)), SUBSTRING(name, 2));
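For readers who want to sanity-check what that UPDATE does to a single value, here is the same capitalization as a tiny Java sketch. The helper is hypothetical and not part of the migration; it assumes name is a plain string and maps null or empty input to an empty string.

```java
import java.util.Locale;

// Hypothetical helper mirroring the UPDATE statement; not part of the migration.
class DisplayNameSketch {
    // Upper-case the first character and keep the rest unchanged.
    static String toDisplayName(String name) {
        if (name == null || name.isEmpty()) {
            return "";
        }
        return name.substring(0, 1).toUpperCase(Locale.ROOT) + name.substring(1);
    }

    public static void main(String[] args) {
        System.out.println(toDisplayName("dataset"));
        System.out.println(toDisplayName("software"));
    }
}
```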
Last updated: Apr 03 2026 at 06:08 UTC