Stream: community

Topic: Data upload questions


Simon Carroll (Aug 02 2024 at 08:21):

Good morning! I have deployed a test instance of Dataverse for the Barcelona Supercomputing Center. Some users are asking about the best way to upload data. Is there a recommended API for large files? (Users are asking whether PUT can be used instead of POST to allow chunking of the transfer. I haven't found much about it in the documentation.) Am I missing another way of chunking large files? Also, are there any recommendations for parallel transfers?

Philip Durbin 🚀 (Aug 02 2024 at 11:47):

Hi! Are you using S3-compatible storage? Do you have "direct upload" enabled? If so, I'd recommend trying DVUploader: https://guides.dataverse.org/en/6.3/user/dataset-management.html#command-line-dvuploader

Simon Carroll (Aug 02 2024 at 12:48):

Philip Durbin said:

Hi! Are you using S3-compatible storage? Do you have "direct upload" enabled? If so, I'd recommend trying DVUploader: https://guides.dataverse.org/en/6.3/user/dataset-management.html#command-line-dvuploader

Hello! Thanks for the reply. At the moment, no. We will potentially be using Swift in the future. For now it is just an OpenStack storage volume mounted on the VM. Later it could be some combination of Swift and our IBM Spectrum Archive storage system.

Simon Carroll (Aug 02 2024 at 12:48):

Do you have any recommendations for the best way to upload files in the current architecture?

Philip Durbin 🚀 (Aug 02 2024 at 13:02):

Ok, so no S3. No Globus either, I assume: https://guides.dataverse.org/en/6.3/admin/integrations.html#globus

Philip Durbin 🚀 (Aug 02 2024 at 13:03):

For a one-off large file upload to non-S3, I've been chatting with @María A. Matienzo about a way to do it over at #troubleshooting > large uploads via native api workaround

Philip Durbin 🚀 (Aug 02 2024 at 13:11):

As @Don Sizemore reminded me, there's also the concept of Trusted Remote Storage: https://guides.dataverse.org/en/6.3/installation/config.html#trusted-remote-storage

In short, Dataverse doesn't manage the files. You just tell Dataverse where the files live.
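
For reference, the POST-based upload asked about at the top of the thread goes through the native API's "add file to dataset" endpoint as a single multipart request. Below is a minimal sketch of how such a request could be assembled with the Python standard library. The server URL, DOI, and token are placeholders, and the helper name `build_add_file_request` is made up for illustration. It builds the request without sending it, holds the whole payload in memory, and does no chunking, so it is not the workaround from the #troubleshooting thread and is probably only reasonable for small or medium files.

```python
import json
import urllib.parse
import urllib.request
import uuid


def build_add_file_request(server_url, persistent_id, api_token,
                           filename, payload, description=""):
    """Build (but do not send) a multipart POST that adds one file to a dataset.

    Hypothetical helper: all argument values are supplied by the caller.
    """
    boundary = uuid.uuid4().hex
    query = urllib.parse.urlencode({"persistentId": persistent_id})
    url = f"{server_url.rstrip('/')}/api/datasets/:persistentId/add?{query}"

    # Optional file metadata travels in a form field named "jsonData".
    json_part = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="jsonData"\r\n\r\n'
        f"{json.dumps({'description': description})}\r\n"
    ).encode("utf-8")

    # The file bytes travel in a form field named "file".
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")

    closing = f"\r\n--{boundary}--\r\n".encode("utf-8")
    body = json_part + file_header + payload + closing

    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "X-Dataverse-key": api_token,  # API token header
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


if __name__ == "__main__":
    # Placeholder values only; nothing is sent over the network here.
    req = build_add_file_request(
        "https://dataverse.example.org",
        "doi:10.5072/FK2/EXAMPLE",
        "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
        "data.bin",
        b"example bytes",
        description="Test upload",
    )
    print(req.get_method(), req.full_url)
```

To actually send it, the request object would be passed to `urllib.request.urlopen(req)`. For files too large to hold in memory, a streaming upload or one of the routes suggested above (direct upload, Globus, Trusted Remote Storage) is likely a better fit.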


Last updated: Nov 01 2025 at 14:11 UTC