Stream: large-data

Topic: ✔ trusted remote storage


Lincoln (Oct 20 2025 at 10:48):

Hi, I have been trying out trusted remote storage for a while and am stuck on what to try next.
a) I am testing with this file: https://svn.vsp.tu-berlin.de/repos/public-svn/matsim/scenarios/countries/de/berlin/berlin-v6.4/output/berlin-v6.4-10pct/berlin-v6.4.scorestats.png
b) My config in domain.xml is:

<jvm-options>-Ddataverse.files.dl.type=remote</jvm-options>
<jvm-options>-Ddataverse.files.dl.label=datalake</jvm-options>
<jvm-options>-Ddataverse.files.dl.base-url=https://svn.vsp.tu-berlin.de/repos/public-svn</jvm-options>
<jvm-options>-Ddataverse.files.dl.base-store=s3</jvm-options>
<jvm-options>-Ddataverse.files.dl.download-redirect=true</jvm-options>
<jvm-options>-Ddataverse.files.dl.public=true</jvm-options>

c) When I run this command, I get the following error:

curl -H "X-Dataverse-key: xxxxxxxxxx" \
     -X POST "http://xxxxxxx:xxx/api/datasets/:persistentId/add?persistentId=doi:10.5072/FK2/ZQBLX7" \
     -F 'jsonData={"description":"Berlin v6.4 10% score statistics plot","storageIdentifier":"dl://matsim/scenarios/countries/de/berlin/berlin-v6.4/output/berlin-v6.4-10pct/berlin-v6.4.scorestats.png","md5Hash":"4f8299b5f5bb324b4e4a9f7a07c3d0b8","label":"berlin-v6.4.scorestats.png","fileName":"berlin-v6.4.scorestats.png","mimeType":"image/png", "checksumType":"MD5"}'

{"status":"ERROR","message":"Dataset store configuration does not allow provided storageIdentifier."}

d) The Payara server log shows nothing apart from this:

jsonData: {"description":"Berlin v6.4 10% score statistics plot","storageIdentifier":"https://svn.vsp.tu-berlin.de/repos/public-svn/matsim/scenarios/countries/de/berlin/berlin-v6.4/output/berlin-v6.4-10pct/berlin-v6.4.scorestats.png","md5Hash":"4f8299b5f5bb324b4e4a9f7a07c3d0b8","label":"berlin-v6.4.scorestats.png","fileName":"berlin-v6.4.scorestats.png","mimeType":"image/png"}]]

e) I’ve selected the “datalake” option as the trusted storage label in Dataverse (using 6.7.1).

Would really appreciate any insights or suggestions from others who’ve worked with remote storage setups like this.
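
For reference, the storageIdentifier in the curl call above is the store id (dl) followed by the file's path relative to the store's base-url. A minimal shell sketch of that mapping, assuming the pattern seen in this thread holds in general; storage_identifier is a hypothetical helper written for illustration, not part of Dataverse:

```shell
# Hypothetical helper: build "<store id>://<path relative to base-url>"
# from a store id, the store's base-url, and the full URL of a file.
storage_identifier() {
  local store_id="$1" base_url="${2%/}" file_url="$3"
  case "$file_url" in
    "$base_url"/*)
      # Strip the base-url prefix and prepend the store id.
      printf '%s://%s\n' "$store_id" "${file_url#"$base_url"/}"
      ;;
    *)
      echo "file URL is not under base-url" >&2
      return 1
      ;;
  esac
}

# Values taken from the post above.
storage_identifier dl \
  "https://svn.vsp.tu-berlin.de/repos/public-svn" \
  "https://svn.vsp.tu-berlin.de/repos/public-svn/matsim/scenarios/countries/de/berlin/berlin-v6.4/output/berlin-v6.4-10pct/berlin-v6.4.scorestats.png"
# prints dl://matsim/scenarios/countries/de/berlin/berlin-v6.4/output/berlin-v6.4-10pct/berlin-v6.4.scorestats.png
```

The helper refuses URLs that do not sit under the base-url, since such a file could not be expressed as a relative path for this store.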

Philip Durbin 🚀 (Oct 20 2025 at 13:30):

@Lincoln could you please start a thread over at https://groups.google.com/g/dataverse-community ? Jim Myers knows the most about file stores and will probably see it there.

Philip Durbin 🚀 (Oct 20 2025 at 13:31):

Meanwhile, as a test perhaps you could try to configure a dataset to use that store: https://guides.dataverse.org/en/6.7.1/admin/dataverses-datasets.html#configure-a-dataset-to-store-all-new-files-in-a-specific-file-store

Philip Durbin 🚀 (Oct 20 2025 at 13:31):

(This shouldn't be necessary since you've already configured the parent collection.)

Lincoln (Oct 20 2025 at 13:33):

@Philip Durbin 🚀 nice to hear from you.
I'll start a thread on Google Groups :)

Philip Durbin 🚀 (Oct 20 2025 at 13:37):

https://groups.google.com/g/dataverse-community/c/yMQ9rFZEs28/m/jaqKdpEtAQAJ looks great. Thanks!

Lincoln (Oct 20 2025 at 15:08):

Adding these two JVM options worked:
-Ddataverse.files.dl.upload-out-of-band=true
-Ddataverse.files.dl.reference-endpoints-with-basepaths=https://svn.vsp.tu-berlin.de/repos/public-svn

Thank you @Philip Durbin 🚀
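
Putting the thread together, the store definition that ended up working would look roughly like the following in domain.xml. This is a sketch assembled only from the options quoted in this thread; the XML comments are editorial, and no option beyond those posted here is assumed.

```xml
<!-- Store definition as originally posted -->
<jvm-options>-Ddataverse.files.dl.type=remote</jvm-options>
<jvm-options>-Ddataverse.files.dl.label=datalake</jvm-options>
<jvm-options>-Ddataverse.files.dl.base-url=https://svn.vsp.tu-berlin.de/repos/public-svn</jvm-options>
<jvm-options>-Ddataverse.files.dl.base-store=s3</jvm-options>
<jvm-options>-Ddataverse.files.dl.download-redirect=true</jvm-options>
<jvm-options>-Ddataverse.files.dl.public=true</jvm-options>
<!-- The two options added afterwards, after which the API call succeeded -->
<jvm-options>-Ddataverse.files.dl.upload-out-of-band=true</jvm-options>
<jvm-options>-Ddataverse.files.dl.reference-endpoints-with-basepaths=https://svn.vsp.tu-berlin.de/repos/public-svn</jvm-options>
```

Changes to domain.xml generally take effect only after Payara is restarted.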

Notification Bot (Oct 20 2025 at 15:09):

Lincoln has marked this topic as resolved.

Philip Durbin 🚀 (Oct 20 2025 at 16:02):

@Lincoln I'm glad it worked! Do you think we should add some documentation? :thinking:

Lincoln (Oct 20 2025 at 16:04):

Yes, I believe this could definitely benefit others.

PS: I just created a GitHub issue, as suggested by Jim.

Philip Durbin 🚀 (Oct 20 2025 at 16:05):

Ah, this, thanks!

[Documentation] Direct-upload on Trusted Remote Storage #11911


Last updated: Nov 01 2025 at 14:11 UTC