This is a great option:
https://guides.dataverse.org/en/latest/api/native-api.html#view-dataset-files-and-folders-as-a-directory-index
and it works when viewing in a browser. Typing this URL into a browser window, I can navigate the directory structure of my dataset and download individual files:
https://demo.dataverse.org/api/datasets/:persistentId/dirindex?persistentId=doi:10.70122/FK2/92XEEU
BUT... of course there is a but...
when I try to download using the wget command line for crawling ("recursive downloading") as specified in the docs, the files (directories) are not saved in the right folder structure.
Here's the command I used, taken from the above API guide:
```
wget -r -e robots=off -nH --cut-dirs=3 --content-disposition "https://demo.dataverse.org/api/datasets/:persistentId/dirindex?persistentId=doi:10.70122/FK2/92XEEU"
```
Am I missing an option?
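In the meantime, here's a sketch of a workaround that sidesteps wget entirely: ask the native API for the dataset's file list (which includes each file's `directoryLabel`) and save each file under that directory yourself. This assumes the `/versions/:latest/files` listing and `/api/access/datafile/{id}` endpoints from the API guide, and that the dataset is public (no API token); the server and DOI are the demo ones from above.

```python
# Workaround sketch: rebuild the folder hierarchy from the native API's
# file listing instead of crawling the dirindex page with wget.
import json
import os
import urllib.request

SERVER = "https://demo.dataverse.org"
PID = "doi:10.70122/FK2/92XEEU"

def local_path(directory_label, filename):
    """Join a file's directoryLabel (None for top-level files) with its
    filename to get the relative path to save it under."""
    return os.path.join(directory_label, filename) if directory_label else filename

def download_dataset(server=SERVER, pid=PID, dest="."):
    # List all files in the latest published version, with their paths.
    url = f"{server}/api/datasets/:persistentId/versions/:latest/files?persistentId={pid}"
    with urllib.request.urlopen(url) as resp:
        files = json.load(resp)["data"]
    for entry in files:
        df = entry["dataFile"]
        rel = local_path(entry.get("directoryLabel"), df["filename"])
        target = os.path.join(dest, rel)
        os.makedirs(os.path.dirname(target) or ".", exist_ok=True)
        # Fetch each file by its database id via the access API.
        urllib.request.urlretrieve(f"{server}/api/access/datafile/{df['id']}", target)
        print("saved", target)

# download_dataset()  # run against the demo dataset from the thread
```

No promises this is the "right" way, but it doesn't depend on wget's `--cut-dirs` behavior or on how the files are stored (local vs. S3).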
Thanks.
I'm pretty sure it's a known issue:
wget command to download all files in a dataset not preserving file hierarchy #8836
Thanks @Philip Durbin, I didn't look that far down in the issues - that one is from 2022.
It doesn't work on demo.dataverse either... is that local or S3?
Doesn't work on our S3 box at UVa.
Oh, I thought it didn't work at all. Are we thinking it might work for non-S3?
I think demo is a mix of S3 and local (per collection) but I'm not sure.
Meghan's "Other notes" seems to imply it does work on "local", but I'm not sure what "S3 emulation" means; this was in v5.8.3:
> Based on our testing, we are wondering if this is related to the file system. When using a local machine with a standard file system, the command seems to work. Borealis uses S3 emulation.
Oh, I see. Good reading comprehension. :grinning: I missed that.
Sherry Lake has marked this topic as resolved.
Last updated: Oct 30 2025 at 06:21 UTC