Stream: dev

Topic: datasetType (software, workflow, review, etc.)


view this post on Zulip Philip Durbin 🚀 (Feb 12 2026 at 15:18):

@Oliver Bertuch thanks for chatting about #11753 yesterday!

I suppose the next step is to figure out a good API to unset (set to null) the list of allowed dataset types for a collection. Any suggestions?

These are the docs I'm looking at but I don't see anything about unsetting an attribute:

view this post on Zulip Oliver Bertuch (Feb 12 2026 at 15:19):

This should be a DELETE REST endpoint, shouldn't it?

view this post on Zulip Philip Durbin 🚀 (Feb 12 2026 at 15:20):

Hmm, so you're thinking a new API or at least a new verb. :thinking:

view this post on Zulip Oliver Bertuch (Feb 12 2026 at 15:20):

Also, should it be allowed to actually not allow any types?

view this post on Zulip Philip Durbin 🚀 (Feb 12 2026 at 15:20):

I was trying to reuse existing functionality. :sweat_smile:

view this post on Zulip Oliver Bertuch (Feb 12 2026 at 15:20):

This doesn't make much sense from a business perspective, does it?

view this post on Zulip Philip Durbin 🚀 (Feb 12 2026 at 15:20):

Well, if it's null, it only allows the "dataset" dataset type.

view this post on Zulip Oliver Bertuch (Feb 12 2026 at 15:21):

But that's mostly due to backward compatibility, isn't it?

view this post on Zulip Philip Durbin 🚀 (Feb 12 2026 at 15:21):

yeah
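
The fallback described above (a null list permits only the "dataset" type) could be sketched roughly as follows. This is a minimal illustration, not the Dataverse implementation: the class and method names are hypothetical, types are modeled as plain strings, and the thread does not say how an empty list should behave, so the sketch simply passes it through unchanged.

```java
import java.util.List;

// Hypothetical helper; not part of the Dataverse codebase.
final class AllowedTypesSketch {
    // The backward-compatible default mentioned in the thread.
    static final String DEFAULT_TYPE = "dataset";

    // A null configuration falls back to allowing only "dataset".
    // Assumption: a non-null list (even an empty one) is used as-is.
    static List<String> effectiveAllowedTypes(List<String> configured) {
        if (configured == null) {
            return List.of(DEFAULT_TYPE);
        }
        return List.copyOf(configured);
    }

    // True if a dataset of the given type may be created in the collection.
    static boolean isAllowed(List<String> configured, String typeName) {
        return typeName != null && effectiveAllowedTypes(configured).contains(typeName);
    }
}
```

With this shape, a collection that has never been configured behaves exactly as before dataset types existed, which is the backward-compatibility point made above.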

view this post on Zulip Philip Durbin 🚀 (Feb 12 2026 at 15:22):

I'm trying to take a light touch with this feature.

view this post on Zulip Oliver Bertuch (Feb 12 2026 at 15:22):

How about making this more explicit? Have a migration that will add the dataset type to all that have none configured.

view this post on Zulip Oliver Bertuch (Feb 12 2026 at 15:22):

That would rule out the "null" case

view this post on Zulip Philip Durbin 🚀 (Feb 12 2026 at 15:22):

But now that I'm converting allowedDatasetTypes from a string to a ManyToMany list, things are getting more complicated. I need to be able to unset the list.

view this post on Zulip Philip Durbin 🚀 (Feb 12 2026 at 15:23):

That's what we did at the dataset level when we introduced dataset types. All existing datasets were set to the "dataset" dataset type (using Flyway).

view this post on Zulip Oliver Bertuch (Feb 12 2026 at 15:25):

Here's an idea: what if for now we make the list non-optional and a migration adds the minimum default to all. At a future point in time, deleting the list and replacing it with null or empty would mean "inherit"

view this post on Zulip Oliver Bertuch (Feb 12 2026 at 15:25):

So we can add any other calls that are necessary to do that in a new iteration

view this post on Zulip Philip Durbin 🚀 (Feb 12 2026 at 15:26):

Actually, that gives me an idea. I was trying to unset the list but maybe I can just explicitly set the list to a dataset type that I'm not trying to delete.

view this post on Zulip Philip Durbin 🚀 (Feb 12 2026 at 15:28):

At the moment, I'm just trying to get DatasetTypesIT to pass along with the change from a String to a List for allowedDatasetTypes.

view this post on Zulip Philip Durbin 🚀 (Feb 12 2026 at 15:32):

Yes! It works! :tada:

view this post on Zulip Oliver Bertuch (Feb 12 2026 at 15:33):

Yeah my suggestion about not allowing "null" or empty list for now was going in this direction of "set the list to a type that still makes sense"

view this post on Zulip Oliver Bertuch (Feb 12 2026 at 15:34):

What happens if you delete a type itself that has been used? It shouldn't be allowed to be dropped, right?

view this post on Zulip Philip Durbin 🚀 (Feb 12 2026 at 15:35):

"message": "java.lang.IllegalStateException: Dataset type with id 6 is referenced and cannot be deleted."

view this post on Zulip Philip Durbin 🚀 (Feb 12 2026 at 15:35):

but now I have the workaround above, which I'm documenting

view this post on Zulip Oliver Bertuch (Feb 12 2026 at 15:35):

Referenced by whom? The collection or a dataset?

view this post on Zulip Oliver Bertuch (Feb 12 2026 at 15:36):

IMHO that is an important distinction.

view this post on Zulip Philip Durbin 🚀 (Feb 12 2026 at 15:37):

Well, this worked before so I'm assuming it's the new List relationship to the collection.
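
The delete guard and the workaround discussed above (point the collection at a different type first, then delete) might look something like this in miniature. This is an in-memory sketch under stated assumptions: the registry class, its methods, and the collection alias are all hypothetical, and it assumes, as guessed above, that the blocking reference comes from a collection's allowed-types list. Only the exception message format is taken from the thread.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory stand-in for the type/collection relationship.
final class TypeRegistrySketch {
    private final Set<Long> typeIds = new HashSet<>();
    // Collection alias -> ids of the dataset types it allows.
    private final Map<String, Set<Long>> allowedByCollection = new HashMap<>();

    void addType(long id) {
        typeIds.add(id);
    }

    // Replaces the collection's whole allowed list (the "explicitly set" workaround).
    void setAllowedTypes(String collectionAlias, Set<Long> ids) {
        allowedByCollection.put(collectionAlias, new HashSet<>(ids));
    }

    // Refuses to delete a type that any collection still references.
    void deleteType(long id) {
        for (Set<Long> allowed : allowedByCollection.values()) {
            if (allowed.contains(id)) {
                throw new IllegalStateException(
                        "Dataset type with id " + id + " is referenced and cannot be deleted.");
            }
        }
        typeIds.remove(id);
    }

    boolean hasType(long id) {
        return typeIds.contains(id);
    }
}
```

In this sketch, calling `deleteType(6)` while a collection allows type 6 throws; after `setAllowedTypes` switches that collection to another type, the same call succeeds.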

view this post on Zulip Philip Durbin 🚀 (Feb 12 2026 at 15:42):

Woo-hoo! Tests passing! I'll clean this up, commit, and push.

view this post on Zulip Philip Durbin 🚀 (Feb 12 2026 at 16:39):

@Oliver Bertuch is this incrementally better? :sweat_smile: https://github.com/IQSS/dataverse/pull/11753/changes/d2f12dfd9eec3d782e68493b6f9293fecaf60b9e - switch allowedDatasetTypes in entity from String to a ManyToMany List

view this post on Zulip Oliver Bertuch (Feb 12 2026 at 16:41):

I need to look into our new storage option that (finally) came through today, so won't be able to look now.

view this post on Zulip Philip Durbin 🚀 (Feb 12 2026 at 16:43):

No worries, I'll look through other stuff to fix that we talked about. Thanks again!

view this post on Zulip Philip Durbin 🚀 (Feb 25 2026 at 16:38):

@Oliver Bertuch I just noticed you still have requested changes on https://github.com/IQSS/dataverse/pull/11753 . Is that right? Or should I click "dismiss review"?

Also, @Gustavo Durand picked up that PR to review it.

view this post on Zulip Oliver Bertuch (Feb 25 2026 at 16:39):

I have not looked at the remarks and the changes you made again. Maybe Gustavo can decide what he thinks about them.

view this post on Zulip Philip Durbin 🚀 (Feb 25 2026 at 16:44):

We might have a zoom call if you'd like to join. Not sure when.

view this post on Zulip Oliver Bertuch (Feb 25 2026 at 16:44):

Sure. Today would be a good fit for that.

view this post on Zulip Philip Durbin 🚀 (Feb 26 2026 at 14:01):

I made those changes for you :heart:

view this post on Zulip Oliver Bertuch (Feb 26 2026 at 14:03):

Let me take another look. Posters are printed. :smile:

view this post on Zulip Oliver Bertuch (Feb 26 2026 at 16:08):

Done.

view this post on Zulip Philip Durbin 🚀 (Feb 26 2026 at 16:23):

Thanks!! :dataverse_man:

view this post on Zulip Oliver Bertuch (Feb 26 2026 at 16:24):

I did my best to reply to open questions and found a few other issues.

view this post on Zulip Oliver Bertuch (Feb 26 2026 at 16:24):

The most important one is certainly the missing equals() and hashCode() on DatasetType.

view this post on Zulip Philip Durbin 🚀 (Mar 02 2026 at 20:54):

@Oliver Bertuch is just id and name enough? This is what VS Code is suggesting:

Screenshot 2026-03-02 at 3.42.27 PM.png

If I check just id and name, it generates this:

    @Override
    public int hashCode() {
        final int prime = 31;
        int result = 1;
        result = prime * result + ((id == null) ? 0 : id.hashCode());
        result = prime * result + ((name == null) ? 0 : name.hashCode());
        return result;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj)
            return true;
        if (obj == null)
            return false;
        if (getClass() != obj.getClass())
            return false;
        DatasetType other = (DatasetType) obj;
        if (id == null) {
            if (other.id != null)
                return false;
        } else if (!id.equals(other.id))
            return false;
        if (name == null) {
            if (other.name != null)
                return false;
        } else if (!name.equals(other.name))
            return false;
        return true;
    }
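
For comparison, the same id-and-name contract can be written more compactly with `java.util.Objects`. This is only a sketch on a hypothetical stand-in class (`DatasetTypeSketch`), not the code that was pushed to the PR; `Objects.hash(id, name)` uses the same 31-based accumulation as the generated method, so the hash values match.

```java
import java.util.Objects;

// Hypothetical stand-in for the DatasetType entity; only id and name are modeled.
final class DatasetTypeSketch {
    private final Long id;
    private final String name;

    DatasetTypeSketch(Long id, String name) {
        this.id = id;
        this.name = name;
    }

    @Override
    public int hashCode() {
        // Null-safe; equivalent to the hand-rolled prime-31 loop over id and name.
        return Objects.hash(id, name);
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (obj == null || getClass() != obj.getClass()) {
            return false;
        }
        DatasetTypeSketch other = (DatasetTypeSketch) obj;
        // Objects.equals treats two nulls as equal, like the generated branches.
        return Objects.equals(id, other.id) && Objects.equals(name, other.name);
    }
}
```

Whether id and name are the right fields for a persisted entity is the question asked above; the sketch does not settle it, it only shortens the generated form.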

view this post on Zulip Philip Durbin 🚀 (Mar 02 2026 at 20:57):

Anyway, pushed: https://github.com/IQSS/dataverse/pull/11753#discussion_r2874633026

view this post on Zulip Philip Durbin 🚀 (Mar 02 2026 at 21:50):

And this: https://github.com/IQSS/dataverse/pull/11753/changes#r2874761915

view this post on Zulip Oliver Bertuch (Mar 02 2026 at 21:52):

Sorry, I was out with @Jan Range in Stuttgart. We had a fab dinner at a local Brauhaus!

view this post on Zulip Oliver Bertuch (Mar 02 2026 at 21:52):

Do you still want input?

view this post on Zulip Philip Durbin 🚀 (Mar 02 2026 at 21:52):

For the other stuff, @Gustavo Durand @Juan Pablo Tosca Villanueva and I decided we'd punt. I'm going to create a tech debt issue for dataset types. Maybe you can help edit the description, @Oliver Bertuch, and we'll decide on the scope and size of the next push, whenever it is.

And I'm still open to adding small things to the PR we're working on now.

view this post on Zulip Oliver Bertuch (Mar 02 2026 at 21:53):

Sounds good to me! I'm just the messenger pointing at things, you decide the course, cap'n :man_rowing_boat:

view this post on Zulip Philip Durbin 🚀 (Mar 02 2026 at 21:54):

Ha. I'll let you know when I've created it.

view this post on Zulip Philip Durbin 🚀 (Mar 02 2026 at 21:54):

I literally wrote "create issue about corners that were cut" :sweat_smile:

view this post on Zulip Philip Durbin 🚀 (Mar 02 2026 at 21:55):

But on the plus side, hopefully we'll get #11753 merged in time for 6.10.

view this post on Zulip Philip Durbin 🚀 (Mar 03 2026 at 19:23):

@Oliver Bertuch I'm clicking "re-request review" on https://github.com/IQSS/dataverse/pull/11753

Screenshot 2026-03-03 at 2.22.59 PM.png

I don't want this to be a blocker for merging. (I know that worst case, we can always dismiss your review.)

view this post on Zulip Philip Durbin 🚀 (Mar 04 2026 at 12:42):

@Oliver Bertuch hi, just checking in. Merging is blocked:

Screenshot 2026-03-04 at 7.41.25 AM.png

view this post on Zulip Oliver Bertuch (Mar 04 2026 at 12:50):

Just today I had a request by a researcher asking for Instruments support. Any objections to add those? https://support.datacite.org/docs/can-i-register-a-doi-for-an-instrument

view this post on Zulip Oliver Bertuch (Mar 04 2026 at 12:51):

(Not as a part of this PR)

view this post on Zulip Philip Durbin 🚀 (Mar 04 2026 at 12:52):

Any types can be added freely.

view this post on Zulip Philip Durbin 🚀 (Mar 04 2026 at 12:53):

Just cook up a JSON file.

view this post on Zulip Oliver Bertuch (Mar 04 2026 at 12:53):

But we'd need to extend the enum for resourceTypeGeneral, wouldn't we?

view this post on Zulip Philip Durbin 🚀 (Mar 04 2026 at 12:53):

Well, sure.

view this post on Zulip Oliver Bertuch (Mar 04 2026 at 12:53):

Like you did for reviews

view this post on Zulip Philip Durbin 🚀 (Mar 04 2026 at 12:54):

If you want it to be sent to DataCite or show up in the CSL citation export.

view this post on Zulip Oliver Bertuch (Mar 04 2026 at 12:54):

Yeah, I'd like that. DataCite has native support for Instruments these days and it would be great to keep the data and instruments in the same place.

view this post on Zulip Philip Durbin 🚀 (Mar 04 2026 at 12:56):

Sure. Future PR, though, please! :sweat_smile: Code freeze is tomorrow.

view this post on Zulip Oliver Bertuch (Mar 04 2026 at 12:56):

I think I said that above :wink:

view this post on Zulip Philip Durbin 🚀 (Mar 04 2026 at 12:56):

So we're trying to get this one tested and merged.

view this post on Zulip Philip Durbin 🚀 (Mar 04 2026 at 12:58):

Are you still thinking about the "merging is blocked" question above? :heart:

view this post on Zulip Philip Durbin 🚀 (Mar 04 2026 at 12:58):

You sort of avoided talking about it. :smile:

view this post on Zulip Oliver Bertuch (Mar 04 2026 at 12:58):

Sorry at a conference.

view this post on Zulip Oliver Bertuch (Mar 04 2026 at 12:58):

I am just doing the review in parallel to a session

view this post on Zulip Philip Durbin 🚀 (Mar 04 2026 at 12:58):

Worst case we'll dismiss your review so we can merge.

view this post on Zulip Oliver Bertuch (Mar 04 2026 at 12:58):

(As you may have noticed)

view this post on Zulip Oliver Bertuch (Mar 04 2026 at 13:02):

I just revisited my main concerns and find them all covered now. Approved. :check: Thanks for working on this and bearing with my lengthy remarks.

view this post on Zulip Philip Durbin 🚀 (Mar 04 2026 at 13:40):

Sure! To be honest, I need to do a little more testing, so I'll be back on that branch when I get to work.

view this post on Zulip Philip Durbin 🚀 (Mar 04 2026 at 13:40):

So if there's anything small you want, lemme know.

view this post on Zulip Juan (Mar 04 2026 at 13:53):

Philip Durbin 🚀 said:

Sure. Future PR, though, please! :sweat_smile: Code freeze is tomorrow.

In Spain (and, I've heard, also in Latin America and I think in France as well), we are receiving requests to upload Data Management Plans to our Dataverse instances.

Could Data Management Plan be considered a new datasetType? Should I open a new PR?

view this post on Zulip Philip Durbin 🚀 (Mar 04 2026 at 14:30):

@Juan interesting! I think we should have a different topic about DMPs. Do you want to create one? Maybe in #community? I'm sure @Viviane Veiga and many others will be interested. By the way, in #12000 we're adding this:

As shown in a video <https://www.researchspace.com/blog/dmptool-rspace-eln-dataverse-integration>, a Data Management Plan (DMP) can be added into RSpace and the research records and associated data can then be sent to Dataverse. Dataverse generates a Persistent Identifier (PID, often a DOI) for the dataset, and RSpace automatically puts the PID link under "Research Outputs" in the DMP.

(Thanks to @Tilo Mathes and @Vaida Plankytė 🎨 and the rest of RSpace for that integration!)

view this post on Zulip Juan (Mar 04 2026 at 14:42):

Thanks @Philip Durbin 🚀, I will watch the video and open the new topic tomorrow. I have meetings for the rest of my workday.

view this post on Zulip Oliver Bertuch (Mar 04 2026 at 15:04):

@Juan @Philip Durbin 🚀 DataCite has support for an "OutputManagementPlan" type these days.

view this post on Zulip Philip Durbin 🚀 (Mar 04 2026 at 15:04):

I saw that. Interesting.

view this post on Zulip Oliver Bertuch (Mar 04 2026 at 15:05):

@Philip Durbin 🚀 should we discuss in a new GH issue if the resource type general should be a configurable type and not hardcoded?

view this post on Zulip Philip Durbin 🚀 (Mar 04 2026 at 15:06):

Or maybe a design doc? I started this one called "Resource Types" back in October: https://docs.google.com/document/d/1xUbOj0vQ629_YF0nMM3OUnfjY4QBw3WVsD_MnrMMBYI/edit?usp=sharing

view this post on Zulip Oliver Bertuch (Mar 04 2026 at 15:06):

Output management plans bring to mind that there are DMPs and SMPs, and maybe more in the future.

view this post on Zulip Oliver Bertuch (Mar 04 2026 at 15:06):

Sure.

view this post on Zulip Philip Durbin 🚀 (Mar 04 2026 at 15:06):

(I don't find GitHub issues to be great for long discussions.)

view this post on Zulip Oliver Bertuch (Mar 04 2026 at 15:06):

It would be good to collect use cases and think some more about where we wanna go with this

view this post on Zulip Oliver Bertuch (Mar 04 2026 at 15:07):

Should we keep the existing design doc and add another chapter/part to it?

view this post on Zulip Philip Durbin 🚀 (Mar 04 2026 at 15:08):

Well, remember we have an even earlier design doc: Proposal: Supporting Multiple Dataset Types in Dataverse - https://docs.google.com/document/d/16RvGXmaPQK9DGsEEbrrFEu8mjUrN93YY_yh7ZONyMDI/edit?usp=sharing

view this post on Zulip Oliver Bertuch (Mar 04 2026 at 15:09):

Oh so we already have multiple. We can always link between them for tracking the history

view this post on Zulip Philip Durbin 🚀 (Mar 04 2026 at 15:09):

I'm fine with whatever. New doc. New Zulip topic. A meeting. Anything but trying to have a discussion in a GitHub issue. :smile:

view this post on Zulip Notification Bot (Mar 04 2026 at 15:17):

5 messages were moved from this topic to #dev > Architecture Decision Records (ADRs) by Oliver Bertuch.

view this post on Zulip Oliver Bertuch (Mar 05 2026 at 09:06):

So sorry @Philip Durbin 🚀 but I did find a small thing. :see_no_evil: Stumbled over this just this morning while looking for other things in the DataCite rTG vocabulary. https://github.com/IQSS/dataverse/pull/11753/changes#r2888545716

view this post on Zulip Philip Durbin 🚀 (Mar 05 2026 at 13:13):

@Oliver Bertuch thanks, I just replied

view this post on Zulip Philip Durbin 🚀 (Mar 05 2026 at 14:05):

Want me to add a comment in the code?

view this post on Zulip Oliver Bertuch (Mar 05 2026 at 15:23):

That may be good, as others may have the same question. Should this go into the docs as well?

view this post on Zulip Philip Durbin 🚀 (Mar 05 2026 at 15:32):

Maybe. Where?

view this post on Zulip Oliver Bertuch (Mar 05 2026 at 16:00):

On the train ride home now. Let me take a look.

view this post on Zulip Oliver Bertuch (Mar 05 2026 at 16:19):

I left a few (three) suggestions on the user doc.

view this post on Zulip Philip Durbin 🚀 (Mar 05 2026 at 17:07):

@Oliver Bertuch I don't suppose you have time for a call

view this post on Zulip Philip Durbin 🚀 (Mar 05 2026 at 17:07):

The name is "dataset" and the displayName is "Dataset". Maybe I can edit your suggestion to make this clear.

view this post on Zulip Oliver Bertuch (Mar 05 2026 at 17:08):

In a public no-phone environment at the moment... Sorry. Will be back to desk later if that helps.

view this post on Zulip Oliver Bertuch (Mar 05 2026 at 17:08):

I was suggesting to use "Dataset" as this is the user documentation and users will see "Dataset" in the UI

view this post on Zulip Oliver Bertuch (Mar 05 2026 at 17:08):

Feel free to ignore any suggestions that don't resonate with you

view this post on Zulip Oliver Bertuch (Mar 05 2026 at 17:08):

I don't feel strongly about any of this.

view this post on Zulip Philip Durbin 🚀 (Mar 05 2026 at 17:09):

Actually, since that suggestion isn't in the API Guide, I'll take it.

view this post on Zulip Philip Durbin 🚀 (Mar 05 2026 at 17:14):

Actually, I took it but I'm going to change it back to lower case "dataset" because that's what is shown in the search facet (as of this PR).

view this post on Zulip Philip Durbin 🚀 (Mar 05 2026 at 17:26):

Ok, I think I'm done making changes: https://github.com/IQSS/dataverse/pull/11753/commits

view this post on Zulip Philip Durbin 🚀 (Mar 05 2026 at 17:27):

Juan said:

Thanks Philip Durbin 🚀, I will watch the video and open the new topic tomorrow. I have meetings for the rest of my workday.

@Juan thanks for kicking off #community > A new supported datasetType for Data Management Plans? !

view this post on Zulip Philip Durbin 🚀 (Mar 05 2026 at 17:29):

@Juan since you're obviously using dataset types (or plan to), I wanted to give you (and others) a heads-up that in #11753 the following backward-incompatible change is coming:

In previous releases of Dataverse, as soon as additional dataset types were added (such as "software", "workflow", etc.), they could be used by all users when creating datasets (via API only). As of this release, on a per-collection basis, superusers must allow these dataset types to be used. See #12115 and #11753.

view this post on Zulip Juan (Mar 05 2026 at 20:22):

Thanks @Philip Durbin 🚀, we will migrate to v6.10 before starting to use dataset types.

view this post on Zulip Philip Durbin 🚀 (Mar 05 2026 at 20:24):

Ok, phew :relieved:

view this post on Zulip Philip Durbin 🚀 (Mar 05 2026 at 20:24):

Longer term, maybe dataset types should turn into resource types. I don't know. :shrugdog:

It would be an architectural change. I gave you comment access on the Resource Types doc I mentioned above. But that doc needs a lot more work.

view this post on Zulip Juan (Mar 05 2026 at 20:27):

Perfect. Thanks

view this post on Zulip Philip Durbin 🚀 (Mar 05 2026 at 20:28):

Have you played with Zenodo? They allow you to create a bunch of different resource types.

view this post on Zulip Juan (Mar 05 2026 at 20:30):

A little bit, but we like Dataverse much more :smile: . For now we will need to add DMPs and software, and perhaps other resource types in the future.

view this post on Zulip Philip Durbin 🚀 (Mar 05 2026 at 20:43):

Here's the Zenodo UI:

zenodo.png

view this post on Zulip Philip Durbin 🚀 (Mar 05 2026 at 20:50):

The OSF UI:

osf.png

view this post on Zulip Juan (Mar 05 2026 at 20:50):

It seems that they do not have a specific resource type for DMPs. Even the DMPs imported from Argos have the "Other" resource type: https://zenodo.org/records/3980771 . Argos and Zenodo are OpenAIRE tools.

view this post on Zulip Philip Durbin 🚀 (Mar 05 2026 at 20:53):

They both seem to have the option for Output Management Plan (which, if you squint, might look like a DMP :smile: ):

omp.png

view this post on Zulip Philip Durbin 🚀 (Mar 05 2026 at 20:54):

Any objections if I move all these messages into the main topic (for dataset types)?

Update: moved! Sorry if it's messy! I moved the messages about potential DMP and instrument dataset types here.

view this post on Zulip Juan (Mar 05 2026 at 20:55):

Yes, you are absolutely right. I didn't find it under publication.

view this post on Zulip Philip Durbin 🚀 (Mar 06 2026 at 15:31):

@Oliver Bertuch @Gustavo Durand I just posted more about nullable true vs false for displayName at https://github.com/IQSS/dataverse/pull/11753/changes#r2788164627

view this post on Zulip Philip Durbin 🚀 (Mar 06 2026 at 15:32):

I might take another run at making displayName nullable=false. Wish me luck!

view this post on Zulip Philip Durbin 🚀 (Mar 06 2026 at 16:07):

Wait, wait, it's already nullable=false:

    @Column(nullable = false, columnDefinition = "VARCHAR(255) DEFAULT ''")
    private String displayName;

@Oliver Bertuch is there anything to fix here? Or in the following?

-- Add displayname column to datasettype table
ALTER TABLE datasettype ADD COLUMN IF NOT EXISTS displayname VARCHAR(255) NOT NULL DEFAULT '';
-- Populate displayname with name but capitalize it (name=dataset becomes displayname=Dataset)
UPDATE datasettype SET displayname = CONCAT(UPPER(SUBSTRING(name, 1, 1)), SUBSTRING(name, 2));
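
To make the effect of that UPDATE concrete, here is a rough Java equivalent of the capitalization it performs (first character upper-cased, the rest left as-is). The class and method names are hypothetical, and treating a null name as an empty string is an assumption made only so the sketch is total.

```java
import java.util.Locale;

// Hypothetical helper mirroring the SQL expression
// CONCAT(UPPER(SUBSTRING(name, 1, 1)), SUBSTRING(name, 2)).
final class DisplayNameSketch {
    static String defaultDisplayName(String name) {
        // Assumption: null is treated like the column default (empty string).
        if (name == null || name.isEmpty()) {
            return "";
        }
        // Upper-case only the first character; keep the remainder untouched.
        return name.substring(0, 1).toUpperCase(Locale.ROOT) + name.substring(1);
    }
}
```

So `defaultDisplayName("dataset")` gives "Dataset", matching the name=dataset, displayname=Dataset example in the SQL comment.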

Last updated: Apr 03 2026 at 06:08 UTC