I am trying to find a way to get the internal dataset ID for a dataset with many files (170k). The PID is known.
The Search API takes about two minutes to finish and does not return the dataset ID. (/api/search?q=PID)
The native API times out, because it seems to try to serialize all file metadata. (/api/datasets/:persistentId/?persistentId=PID)
So, can the dataset ID be obtained from the Search API somehow, or can the files be excluded from the dataset listing in the native API?
I would suggest setting show_entity_ids=true when using the Search API: https://guides.dataverse.org/en/6.4/api/search.html
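A minimal sketch of that lookup in Python, for illustration only. It assumes the Search API lists hits under `data.items`, that each dataset hit carries its PID in `global_id`, and that `show_entity_ids=true` adds an `entity_id` field holding the database ID. The helper names (`build_search_url`, `extract_entity_id`, `find_dataset_id`) and the timeout value are hypothetical; check the field names against your installation's actual response.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_search_url(base_url, pid):
    """Build a Search API URL that asks for database IDs of dataset hits."""
    params = {
        "q": pid,                   # same query as in the thread: /api/search?q=PID
        "type": "dataset",          # only dataset hits are of interest here
        "show_entity_ids": "true",  # ask the Search API to include database IDs
    }
    return f"{base_url.rstrip('/')}/api/search?{urlencode(params)}"


def extract_entity_id(response, pid):
    """Return the entity_id of the hit whose global_id equals pid, else None.

    Assumes the response shape {"data": {"items": [...]}} (an assumption).
    """
    for item in response.get("data", {}).get("items", []):
        if item.get("global_id") == pid:
            return item.get("entity_id")
    return None


def find_dataset_id(base_url, pid, timeout=300):
    """Query the server and return the database ID, or None if not found.

    The generous timeout is a guess, since the search was reported as slow.
    """
    with urlopen(build_search_url(base_url, pid), timeout=timeout) as resp:
        return extract_entity_id(json.load(resp), pid)
```

Usage would look like `find_dataset_id("https://your.dataverse.example", "doi:...")`, with the base URL and PID replaced by real values.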
Thanks, this seems to work, even if it is quite slow.
Is something like this faster? https://dataverse.harvard.edu/api/datasets/:persistentId/versions/:latest-published?persistentId=doi:10.7910/DVN/TJCLKP&excludeFiles=false
I tried with both excludeFiles=true and excludeFiles=false, but neither works, most probably because our Dataverse is still on 6.1 :( But I will keep that in mind in case we upgrade.
Oh, right. I forget when it was added. Pretty recently, for the SPA, because it was slow to retrieve that data otherwise.
It looks like it was added in 6.2 (renamed at least): https://guides.dataverse.org/en/6.4/api/changelog.html#v6-2
Renamed in PR #10191.
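For installations where the `excludeFiles` parameter is available, the version endpoint mentioned above could be called with `excludeFiles=true` so that file metadata is left out of the response. The sketch below assumes the version JSON carries the database ID in `data.datasetId`; that field name and the helper names (`build_version_url`, `extract_dataset_id`, `fetch_dataset_id`) are assumptions, not confirmed in this thread.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_version_url(base_url, pid, version=":latest-published"):
    """Build the native API version URL with file metadata excluded."""
    query = urlencode({"persistentId": pid, "excludeFiles": "true"})
    return (
        f"{base_url.rstrip('/')}/api/datasets/:persistentId"
        f"/versions/{version}?{query}"
    )


def extract_dataset_id(response):
    """Return data.datasetId from the version JSON, or None if absent.

    The field name datasetId is an assumption about the response shape.
    """
    return response.get("data", {}).get("datasetId")


def fetch_dataset_id(base_url, pid, timeout=60):
    """Call the endpoint and return the database ID, or None if missing."""
    with urlopen(build_version_url(base_url, pid), timeout=timeout) as resp:
        return extract_dataset_id(json.load(resp))
```

On an older installation such as 6.1, where the parameter was reported not to work, the Search API approach with show_entity_ids=true remains the fallback.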
Anyway, @Péter Pallinger I guess we could add a new API that takes a DOI/PID and simply returns the database id. If you want something like this, please feel free to open an issue.
Last updated: Nov 01 2025 at 14:11 UTC