Stream: large-data

Topic: Filesystem -> S3 migration documentation still valid?


mjlassila (Sep 04 2025 at 09:23):

I noticed that to get the filesystem-to-S3 migration to work, the file:// prefix in dvobject.storageidentifier needs to be replaced with the configured storage ID rather than with s3://, as currently instructed in https://guides.dataverse.org/en/latest/developers/deployment.html . So the protocol part of dvobject.storageidentifier will look like <storage_id>://

We are using CEPH S3 instead of Amazon S3, so perhaps this is related to that (?). Dataverse version is 6.7.1.

Philip Durbin 🚀 (Sep 04 2025 at 11:07):

Hmm. It looks like @Don Sizemore added those docs in #6789. Any comment, Don?

Don Sizemore (Sep 04 2025 at 11:23):

the prefix you're seeing isn't the protocol but the identifier of the datastore in use. you're essentially copying the files outside of Dataverse, then updating the location pointer within the database.

mjlassila (Sep 04 2025 at 12:11):

Would it make sense to change the example SQL update queries so they don’t use s3:// as the datastore identifier, and use something like <datastore_identifier>:// instead?
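
A minimal sketch of what such a generalized update could look like, written in Python against a toy in-memory SQLite table rather than a real Dataverse database. Only the table name dvobject and the column storageidentifier come from this thread; the function name, the store ID "ceph1", the sample identifiers, and the reduced two-column schema are assumptions for illustration. It only swaps the file:// prefix for <datastore_identifier>:// and does not cover anything else a real migration may need.

```python
import sqlite3


def repoint_storage_identifiers(conn, store_id, old_prefix="file://"):
    """Replace a leading old_prefix with '<store_id>://' in storageidentifier.

    Hypothetical helper: only rows whose identifier starts with old_prefix
    are touched, so rows that already point at another store are left alone.
    Returns the number of rows updated.
    """
    new_prefix = f"{store_id}://"
    cur = conn.execute(
        "UPDATE dvobject "
        "SET storageidentifier = ? || substr(storageidentifier, ?) "
        "WHERE substr(storageidentifier, 1, ?) = ?",
        # substr() is 1-indexed, so the remainder starts at len(old_prefix) + 1.
        (new_prefix, len(old_prefix) + 1, len(old_prefix), old_prefix),
    )
    conn.commit()
    return cur.rowcount


# Toy data: a stand-in for the real table, with made-up identifiers.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dvobject (id INTEGER PRIMARY KEY, storageidentifier TEXT)"
)
conn.executemany(
    "INSERT INTO dvobject VALUES (?, ?)",
    [(1, "file://example-file-id"), (2, "ceph1://already-migrated")],
)

# "ceph1" is an assumed store ID; use whatever ID the installation configures.
changed = repoint_storage_identifiers(conn, "ceph1")
for row in conn.execute("SELECT id, storageidentifier FROM dvobject ORDER BY id"):
    print(row)
```

Passing the store ID as a parameter, instead of hard-coding s3://, is the whole point of the sketch: the prefix is whatever the datastore is called in the installation's configuration.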

Don Sizemore (Sep 04 2025 at 12:25):

absolutely. I think what you're seeing is that, in the early days, everybody called theirs s3 :smile:

Philip Durbin 🚀 (Sep 04 2025 at 15:08):

@mjlassila great idea. If you'd like to make a pull request, I'm happy to help!

mjlassila (Sep 08 2025 at 10:42):

I made a pull request, I'm happy to tweak it more if it needs improvement: https://github.com/IQSS/dataverse/pull/11795

Philip Durbin 🚀 (Sep 08 2025 at 11:47):

Looks great! Merged! Thanks, @mjlassila! :dataverse_man:


Last updated: Nov 01 2025 at 14:11 UTC