@Philip Durbin asked me to open a topic to discuss improving "Related datasets", since it ties into a new project, "Trusted Data".
I previously wrote a feature proposal on replacing the current unstructured string fields for dataset relationships with a more structured approach. I also built a small prototype using the CVOC mechanism, which can be installed in any Dataverse instance for testing. The prototype includes autosuggest for entering related datasets within the same Dataverse, displays related datasets on the dataset page with clickable links, and automatically infers inverse relationships.
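To make the "automatically infers inverse relationships" part concrete, here is a minimal sketch of the idea, not the prototype's actual code. The relation names are borrowed from the DataCite relationType vocabulary as an assumption, and `INVERSE` and `infer_inverses` are hypothetical names used only for illustration.

```python
# Hypothetical relation pairs, borrowed from the DataCite relationType
# vocabulary; the prototype's actual vocabulary may differ.
INVERSE = {
    "IsSupplementTo": "IsSupplementedBy",
    "IsNewVersionOf": "IsPreviousVersionOf",
    "References": "IsReferencedBy",
}
# Make the lookup symmetric so either direction can be inferred.
INVERSE.update({v: k for k, v in list(INVERSE.items())})


def infer_inverses(relations):
    """Return inverse triples that are not already stated explicitly.

    relations: iterable of (source_pid, relation_type, target_pid) triples.
    """
    stated = set(relations)
    inferred = set()
    for source, rel, target in stated:
        inverse = INVERSE.get(rel)
        if inverse is not None:
            inferred.add((target, inverse, source))
    return inferred - stated
```

With something like this, only one side of a relationship needs to be entered; the other dataset's page can show the inferred inverse.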
Since then, we've also integrated and extended this in our custom UI (see screenshots). The code is still somewhat prototype/hack-ish and in internal testing, but our implementation is based on a custom datasetGroupId field. This lets us group related datasets in search results and display all related datasets (including inferred inverse and transitive relationships) on the dataset page. Dataset groups are automatically updated whenever "Related datasets" metadata changes, via a pre-publication workflow.
Screenshot from 2025-08-18 16-45-53.png
(Dataset page)
Screenshot from 2025-08-18 16-45-12.png
(Search result)
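For anyone curious how a group ID can cover inverse and transitive relationships, one way to think about it is as connected components over the relationship graph. The sketch below uses a small union-find for that; it is an illustration under that assumption, not our workflow code, and `group_ids` is a hypothetical helper name.

```python
def group_ids(relations):
    """Map each dataset PID to a group ID (the smallest PID in its group).

    relations: iterable of (source_pid, relation_type, target_pid) triples.
    Direction and relation type are ignored, so inverse and transitive
    links end up in the same group.
    """
    parent = {}

    def find(pid):
        parent.setdefault(pid, pid)
        while parent[pid] != pid:
            parent[pid] = parent[parent[pid]]  # path halving
            pid = parent[pid]
        return pid

    for source, _rel, target in relations:
        root_s, root_t = find(source), find(target)
        if root_s != root_t:
            # Keep the smaller PID as root so group IDs are deterministic.
            low, high = sorted((root_s, root_t))
            parent[high] = low

    return {pid: find(pid) for pid in list(parent)}
```

A value like this could then be stored in a field such as datasetGroupId and recomputed whenever the relationship metadata changes.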
Hi! Yes, thanks for kicking off this topic! (I know there's a thread going on the Google Group as well.)
In short, we are playing around with having a new research object in Dataverse. For now, we're calling it a "review" and the idea is that people would be able to review datasets.
We figured we'd try building on the new-ish "dataset type" functionality and create a new type: datasetType=review. That's what #11747 is about and I have some code I'm playing around with at https://github.com/IQSS/dataverse/compare/11747-review-dataset-type
Reviews should refer to datasets so in the code you'll see I'm using the Related Dataset metadata field.
However, very quickly I was reminded that Related Dataset has only a single, unstructured field, as you say. "Primitive" we call it in the code.
So I thought I'd at least (finally) play around with the prototype at https://github.com/vera/related-datasets-cvoc
As far as actually changing the fields in Related Dataset, yes, we'd need to migrate the existing data somehow: https://github.com/vera/related-datasets-cvoc/issues/4
Overall, I like the prototype, especially how you can link to local datasets and any remote URL.
I gave @Ceilyn Boyd a demo on Friday.
I'm very interested in following this. We were talking about something like this today, where we could see harvesting datasets and also having a "review" dataset about them, for reproducibility, etc. This review dataset would essentially augment the original one by containing some files created during curation / review. The goal would be to not copy the original files, just have these extra ones available for review/reproducibility.
Ah, great. @Ceilyn Boyd how do you feel about a Zulip topic on review datasets? Or maybe trusted data generally, since review datasets are a bit of an implementation detail (one of the options).
A message was moved from this topic to #dev > review datasets by Philip Durbin.
We created a new topic on review datasets:
@Vera Clemens meanwhile, I briefly demo'ed your new and improved related datasets prototype to @Ceilyn Boyd @Sonia Barbosa @Julian Gautier @Ellen K and @Danny Ebanks yesterday.
Last updated: Nov 01 2025 at 14:11 UTC