Hello,
I'm trying to understand how Dataverse manages metadata. I understand metadata blocks, and how to add custom ones, at a high and maybe medium level.
I'm trying to figure out how the system decides which metadata to distribute when exporting via the UI or when being harvested. For example, when you export metadata in, say, DDI format, will I only get the fields that are part of DDI across all metadata blocks? Or in DC, will I only get a field if it has a DC termURI in the TSV file? Basically, I'm trying to figure out how the system picks which fields to export, per export format.
We've enabled an OAI-PMH harvesting client and are using it to harvest the oai_ddi format. What I understand from that is that we'd get DDI fields. What I don't understand is which fields, exactly, we get from the server we are harvesting from.
Sorry if this is a basic question. I'm new to DDI - so I may have some basic misunderstandings there. Thanks, in advance, for any help.
As you've noticed, you have a variety of formats to choose from when setting up a harvesting client.
DDI is a fine choice, much richer than DC.
You can pick Dataverse's native JSON format if you happen to be harvesting from another installation of Dataverse. It's the most complete harvesting format, and will include custom metadata blocks, but you have to have Dataverse on each side.
To get a sense of which DDI fields will be imported and where, I'd suggest looking at the crosswalk linked from https://guides.dataverse.org/en/6.3/user/appendix.html
@Julian Gautier maintains that crosswalk.
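For anyone who wants to see what a remote server actually sends for oai_ddi, here is a minimal sketch that builds the OAI-PMH request URLs so the raw XML can be opened in a browser and compared with the crosswalk. The verbs and the `metadataPrefix` argument come from the OAI-PMH 2.0 protocol; the host name is hypothetical, and the `/oai` path is an assumption about where the remote installation exposes its OAI server, so check both against the server you are harvesting from.

```python
from urllib.parse import urlencode


def oai_request_url(base_url, verb, **params):
    """Build an OAI-PMH request URL from a verb and its arguments."""
    query = {"verb": verb, **params}
    return f"{base_url}?{urlencode(query)}"


# Hypothetical server; the "/oai" path is an assumption to verify.
BASE = "https://dataverse.example.edu/oai"

# Lists the metadata formats the server offers for harvesting.
formats_url = oai_request_url(BASE, "ListMetadataFormats")

# Lists full records in the oai_ddi format.
records_url = oai_request_url(BASE, "ListRecords", metadataPrefix="oai_ddi")

print(formats_url)
print(records_url)
```

Opening the second URL shows the DDI XML for each record, which is the input the harvesting client maps into your installation.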
Thanks, Phil. So if we used DC, just for the sake of example, the system would use this crosswalk to put together the record it exports?
Does this mean that if you create a custom metadata block and introduced new DC fields to the system, that the only export these fields would show up in is the JSON one?
Well, when you harvest, you only get what we call a "search card" in your system. I don't think you can export it.
The idea is that the dataset is discoverable in your system but when someone clicks, they are sent to the original source.
That makes sense. We were just curious about which fields were harvested. All the DDI fields, or a subset?
Well, the DDI fields have to be filled in, but, yes, all of them. :grinning:
I think! Again, I guess I'd refer you to that crosswalk doc.
Bethany Seeger said:
Does this mean that if you create a custom metadata block and introduced new DC fields to the system
I'm a little confused by this. Are we missing some DC fields? I think there are only 15. I thought we had them all already.
No, I'm sure I just picked a bad example. :)
But yes, from a harvesting perspective only Dataverse's native JSON contains custom metadata blocks.
I believe OAI_ORE is also complete in this sense but last I checked it can't be used for harvesting.
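One way to check this for a specific dataset is to export it in several formats and compare which fields appear in each. The sketch below only builds the URLs. It assumes the dataset export endpoint (`/api/datasets/export` with `exporter` and `persistentId` parameters) and the exporter names shown, which should be confirmed in the API guide for your Dataverse version; the server and DOI are hypothetical placeholders.

```python
from urllib.parse import urlencode


def export_url(server, persistent_id, exporter):
    """Build a dataset export URL for one exporter (assumed endpoint)."""
    query = urlencode({"exporter": exporter, "persistentId": persistent_id})
    return f"{server}/api/datasets/export?{query}"


SERVER = "https://dataverse.example.edu"  # hypothetical installation
PID = "doi:10.5072/FK2/EXAMPLE"           # hypothetical dataset DOI

# Exporter names are assumptions; check them against your installation.
EXPORTERS = ["dataverse_json", "ddi", "oai_dc"]

urls = {name: export_url(SERVER, PID, name) for name in EXPORTERS}
for name, url in urls.items():
    print(name, url)
```

If a field from a custom metadata block shows up in the `dataverse_json` output but not in the `ddi` or `oai_dc` output, that matches the behavior described above.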
Thanks, Phil. I appreciate your help.
Sure. Good luck!
Last updated: Nov 01 2025 at 14:11 UTC