Good morning! I have deployed a test instance of Dataverse for the Barcelona Supercomputing Center. Some users are asking about the best way to upload data. For large files, is there a recommended API? (Users are asking if PUT can be used instead of POST to allow chunking of the transfer. I haven't found much about it in the documentation.) Am I missing another way of chunking large files? Also, are there any recommendations for parallel transfers?
Hi! Are you using S3-compatible storage? Do you have "direct upload" enabled? If so, I'd recommend trying DVUploader: https://guides.dataverse.org/en/6.3/user/dataset-management.html#command-line-dvuploader
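For context on the PUT vs. POST question: the native API's standard add-file call is a multipart POST (chunked PUT only comes into play with S3 direct upload, which DVUploader can drive). Below is a minimal sketch of that POST in Python; the server URL, API token, and dataset DOI are placeholders and not from this thread.

```python
import json
import requests

# Placeholders -- substitute your own installation, token, and dataset DOI.
SERVER = "https://dataverse.example.edu"
API_TOKEN = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
PERSISTENT_ID = "doi:10.5072/FK2/EXAMPLE"


def add_file(path, description=""):
    """Upload one file to an existing dataset via the native API (multipart POST)."""
    metadata = {"description": description, "restrict": "false"}
    with open(path, "rb") as fh:
        resp = requests.post(
            f"{SERVER}/api/datasets/:persistentId/add",
            params={"persistentId": PERSISTENT_ID},
            headers={"X-Dataverse-key": API_TOKEN},
            files={
                "file": (path, fh),
                "jsonData": (None, json.dumps(metadata)),
            },
        )
    resp.raise_for_status()
    return resp.json()


if __name__ == "__main__":
    print(add_file("large_data.tar"))
```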
Philip Durbin said:
Hi! Are you using S3-compatible storage? Do you have "direct upload" enabled? If so, I'd recommend trying DVUploader: https://guides.dataverse.org/en/6.3/user/dataset-management.html#command-line-dvuploader
Hello! Thanks for the reply. At the moment, no. We will potentially be using Swift in the future. For the moment it is just an OpenStack storage volume mounted on the VM. Later it could be some combination of Swift with our IBM Spectrum Archive storage system.
Do you have any recommendations for the best way to upload files in the current architecture?
Ok, so no S3. No Globus either, I assume: https://guides.dataverse.org/en/6.3/admin/integrations.html#globus
For a one-off large file upload to non-S3, I've been chatting with @María A. Matienzo about a way to do it over at #troubleshooting > large uploads via native api workaround
As @Don Sizemore reminded me, there's also the concept of Trusted Remote Storage: https://guides.dataverse.org/en/6.3/installation/config.html#trusted-remote-storage
In short, Dataverse doesn't manage the files. You just tell Dataverse where the files live.
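As a rough sketch of what "telling Dataverse where the files live" can look like with a trusted remote store: the idea is to register file metadata plus a storageIdentifier pointing into the remote store, without sending any bytes to Dataverse. The store id ("trs"), path, checksum, and other values below are placeholders, and the jsonData field names follow the direct-upload registration pattern, so check the guides for your Dataverse version before relying on them.

```python
import json
import requests

# Placeholders -- not from this thread.
SERVER = "https://dataverse.example.edu"
API_TOKEN = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
PERSISTENT_ID = "doi:10.5072/FK2/EXAMPLE"


def register_remote_file():
    """Register a file that already lives in a trusted remote store.

    No file content is uploaded; the storageIdentifier (assumed store id
    'trs' plus a path within that store) tells Dataverse where it lives.
    """
    metadata = {
        "storageIdentifier": "trs://simulations/run42/output.nc",
        "fileName": "output.nc",
        "mimeType": "application/x-netcdf",
        "checksum": {"@type": "MD5", "@value": "0123456789abcdef0123456789abcdef"},
        "description": "Simulation output kept on the remote storage system.",
    }
    resp = requests.post(
        f"{SERVER}/api/datasets/:persistentId/add",
        params={"persistentId": PERSISTENT_ID},
        headers={"X-Dataverse-key": API_TOKEN},
        files={"jsonData": (None, json.dumps(metadata))},
    )
    resp.raise_for_status()
    return resp.json()
```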