Stream: troubleshooting

Topic: trusted remote storage setup


jamie jamison (Dec 11 2023 at 19:34):

This is a continuation of my issues setting up trusted remote storage. Originally I tried a test on Google Drive (which I couldn't get working), but I have had to switch to trying Box, an S3 bucket, or both.
The problem I have seems to be setting up the identifier.
Box example:
-Ddataverse.files.lariac-remote.label=LARIAC-remote
-Ddataverse.files.lariac-remote.download-redirect=true
-Ddataverse.files.lariac-remote.public=true
-Ddataverse.files.lariac-remote.type=remote
-Ddataverse.files.lariac-remote.base-url=https://ucla.box.com
-Ddataverse.files.lariac-remote.base-store=file
-Ddataverse.files.lariac-remote.upload-redirect=true

The JSON_Data file:
export API_TOKEN=****
export SERVER_URL=https://dataverse.ucla.edu
export PERSISTENT_ID=doi:10.25346/S6/UEPJMA
export JSON_DATA="{"description":"Remote LARIAC image test", "storageIdentifier": "LARIAC-remote://s/xv53zxxsovt3mmicae9fshnk3ljycvcp","label": "nlcd_la.tif", "fileName": "nlcd_la.tif", "mimeType": "image/tiff", "checksum": {"@type": "MD5", "@value": "*****" }}"

Thank you, Jamie

Philip Durbin 🚀 (Dec 11 2023 at 19:37):

Hmm, does removing the dash help? And what error do you get?

jamie jamison (Dec 11 2023 at 19:43):

Sorry, should have added that.
-bash: export: `test, storageIdentifier: LARIAC-remote://s/xv53zxxsovt3mmicae9fshnk3ljycvcp,label: nlcd_la.tif, fileName: nlcd_la.tif, mimeType: image/tiff, checksum: {@type: MD5, @value: 8d0e37b3d0cb048c2ba4ae68c31e0c1b }}': not a valid identifier

Philip Durbin 🚀 (Dec 11 2023 at 19:47):

That "not a valid identifier" is coming from bash.

Philip Durbin 🚀 (Dec 11 2023 at 19:47):

It doesn't like what you've tried to assign to JSON_DATA (I think).

Philip Durbin 🚀 (Dec 11 2023 at 19:47):

what if you do this?

echo $JSON_DATA

jamie jamison (Dec 11 2023 at 19:49):

echo $JSON_DATA
{description:Remote

So I guess my json still isn't correct.
I've tried to follow the Dataverse documentation example but obviously I'm getting something wrong.

jamie jamison (Dec 11 2023 at 19:50):

I've been typing my code on the Windows side and pasting it over to Linux. Maybe I need to actually type everything on the Linux side.

Philip Durbin 🚀 (Dec 11 2023 at 19:52):

Well, the example in the guides might be wrong. You're welcome to create an issue if it is.

Stuffing JSON in a bash variable can be tricky. For starters I would surround the JSON with single quotes instead of double quotes.

Philip Durbin 🚀 (Dec 11 2023 at 19:53):

export FOO='{"foo":"bar"}'
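A minimal sketch of why the outer quotes matter, assuming a POSIX-style shell such as bash. The JSON is a shortened, hypothetical stand-in for the real payload, and show_args is a helper made up for this illustration. With double quotes on the outside, every inner double quote closes the string, so the quotes vanish and the unquoted space splits the value into separate words. That would explain both the "not a valid identifier" message and the truncated echo output earlier in this thread. With single quotes on the outside, the inner double quotes survive.

```bash
# Print each argument on its own line, wrapped in <> so the boundaries show.
show_args() { printf '<%s>\n' "$@"; }

# Double quotes outside: the inner quotes end the string early, so the shell
# passes two words and none of the JSON quotes remain.
broken=$(show_args JSON_DATA="{"description":"Remote test"}")

# Single quotes outside: everything between them is taken literally,
# so the shell passes one word with the JSON intact.
fixed=$(show_args JSON_DATA='{"description":"Remote test"}')

printf 'double-quoted outside:\n%s\n' "$broken"
printf 'single-quoted outside:\n%s\n' "$fixed"
```

When export receives the second, split-off word, it treats it as a variable name, which is where the identifier complaint comes from.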

jamie jamison (Dec 11 2023 at 20:00):

Still not working - maybe I should set the variables individually instead of a json file
export JSON_DATA="{'description':'Remote LARIAC image test', 'storageIdentifier': 'lariac-remote-test://s/xv53zxxsovt3mmicae9fshnk3ljycvcp','label': 'nlcd_la.tif', 'fileName': 'nlcd_la.tif', 'mimeType': 'image/tiff', 'checksum': {'@type': 'MD5', '@value': '*' }}”

"
[ec2-user@ip-172-31-17-169 ~]$ echo $JSON_DATA
{'description':'Remote LARIAC image test', 'storageIdentifier': 'lariac-remote-test://s/xv53zxxsovt3mmicae9fshnk3ljycvcp','label': 'nlcd_la.tif', 'fileName': 'nlcd_la.tif', 'mimeType': 'image/tiff', 'checksum': {'@type': 'MD5', '@value': '****' }}”
[ec2-user@ip-172-31-17-169 ~]$ curl -H "X-Dataverse-key: $API_TOKEN" -X POST "$SERVER_URL/api/datasets/:persistentId/add?persistentId=$PERSISTENT_ID" -F "jsonData=$JSON_DATA"
{"status":"ERROR","message":"Error in parsing provided json"}[ec2-user@ip-172-31-17-169 ~]$

Philip Durbin 🚀 (Dec 11 2023 at 20:01):

Can you please link me to the place in the guides you're looking at?

jamie jamison (Dec 11 2023 at 20:03):

Here is the link: https://guides.dataverse.org/en/latest/developers/big-data-support.html?highlight=trusted%20remote#trusted-remote-storage-with-the-remote-store-type

Philip Durbin 🚀 (Dec 11 2023 at 20:05):

Bleh, that example JSON will never work:

export JSON_DATA="{'description':'My description.','directoryLabel':'data/subdir1','categories':['Data'], 'restrict':'false', 'storageIdentifier':'trs://images/dataverse_project_logo.svg', 'fileName':'dataverse_logo.svg', 'mimeType':'image/svg+xml', 'checksum': {'@type': 'SHA-1', '@value': '123456'}}"

JSON uses double quotes, not single quotes. Do you want to open an issue for this?

jamie jamison (Dec 11 2023 at 20:06):

Ok, will open an issue but should everything have double quotes?

Philip Durbin 🚀 (Dec 11 2023 at 20:06):

Here, this version passes through jq at least:

export JSON_DATA='{"description":"My description.","directoryLabel":"data/subdir1","categories":["Data"], "restrict":"false", "storageIdentifier":"trs://images/dataverse_project_logo.svg", "fileName":"dataverse_logo.svg", "mimeType":"image/svg+xml", "checksum": {"@type": "SHA-1", "@value": "123456"}}'
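A sketch of the kind of check mentioned above: piping a candidate value through jq before sending it to the API. It assumes jq may or may not be installed, so the hypothetical check_json helper reports "skipped" when it is missing. The two sample strings are shortened versions of the single-quoted and double-quoted payloads in this thread.

```bash
# Report whether a string parses as JSON, using jq if it is installed.
check_json() {
  if ! command -v jq >/dev/null 2>&1; then
    echo skipped   # jq is not available on this machine
  elif printf '%s' "$1" | jq . >/dev/null 2>&1; then
    echo valid
  else
    echo invalid
  fi
}

# Single-quoted keys and values, as in the guide's example: not JSON.
check_json "{'fileName':'dataverse_logo.svg'}"

# Double-quoted keys and values: JSON.
check_json '{"fileName":"dataverse_logo.svg"}'
```

Running the same check on "$JSON_DATA" just before the curl call can catch a quoting mistake locally, before the server reports a parsing error.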

Philip Durbin 🚀 (Dec 11 2023 at 20:06):

Maybe you can use that as a starting point.

jamie jamison (Dec 11 2023 at 20:08):

and I'll test that out now and open the issue

jamie jamison (Dec 11 2023 at 20:22):

I have a new error, but I think the JSON is now correct. I probably have to correct the JVM options.
echo $JSON_DATA
{"description":"Remote LARIAC image test", "storageIdentifier": "lariac-remote-test://s/xv53zxxsovt3mmicae9fshnk3ljycvcp","label": "nlcd_la.tif", "fileName": "nlcd_la.tif", "mimeType": "image/tiff", "checksum": {"@type": "MD5", "@value": "8d0e37b3d0cb048c2ba4ae68c31e0c1b" }}
[ec2-user@ip-172-31-17-169 ~]$ export JSON_DATA='{"description":"Remote LARIAC image test", "storageIdentifier": "lariac-remote-test://s/xv53zxxsovt3mmicae9fshnk3ljycvcp","label": "nlcd_la.tif", "fileName": "nlcd_la.tif", "mimeType": "image/tiff", "checksum": {"@type": "MD5", "@value": "****" }}'
[ec2-user@ip-172-31-17-169 ~]$ curl -H "X-Dataverse-key: $API_TOKEN" -X POST "$SERVER_URL/api/datasets/:persistentId/add?persistentId=$PERSISTENT_ID" -F "jsonData=$JSON_DATA"
{"status":"ERROR","message":"Dataset store configuration does not allow provided storageIdentifier."}[ec2-user@ip-172-31-17-169 ~]$

Philip Durbin 🚀 (Dec 11 2023 at 20:24):

You're on 5.14, right? That error gets printed in three different places (unfortunately):

src/main/java/edu/harvard/iq/dataverse/datasetutility/AddReplaceFileHelper.java
2066:                                addErrorSevere("Dataset store configuration does not allow provided storageIdentifier.");
2230:                                addErrorSevere("Dataset store configuration does not allow provided storageIdentifier.");

src/main/java/edu/harvard/iq/dataverse/api/Datasets.java
2471:                            "Dataset store configuration does not allow provided storageIdentifier.")

jamie jamison (Dec 11 2023 at 20:26):

Looks like the error is in how I set up the datastore. I'll go back over the documentation. The examples seem to assume the storage is on a web site. I'm trying to see how that translates to a departmental Box.
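One thing that may be worth checking, offered as an assumption and not confirmed anywhere in this thread: the part of the storageIdentifier before "://" presumably has to name a configured store, the way "trs://" does in the guide's example. The JVM options posted at the top define a store called lariac-remote, while the request used lariac-remote-test (and, in the first attempt, LARIAC-remote). The sketch below only compares those strings, using values copied from this thread. It does not talk to Dataverse, and a match would not guarantee that the dataset is actually allowed to use that store.

```bash
# A subset of the JVM options posted at the top of this thread.
jvm_options='-Ddataverse.files.lariac-remote.type=remote
-Ddataverse.files.lariac-remote.base-url=https://ucla.box.com
-Ddataverse.files.lariac-remote.base-store=file'

# The storageIdentifier sent in the failing request.
storage_identifier='lariac-remote-test://s/xv53zxxsovt3mmicae9fshnk3ljycvcp'

# Assumed convention: the store id is everything before "://".
requested_store=${storage_identifier%%://*}

# Collect store ids from the -Ddataverse.files.<id>.type=... lines.
configured_stores=$(printf '%s\n' "$jvm_options" |
  sed -n 's/^-Ddataverse\.files\.\([^.]*\)\.type=.*/\1/p')

# Exact, case-sensitive, whole-line comparison.
if printf '%s\n' "$configured_stores" | grep -Fqx -- "$requested_store"; then
  match=yes
else
  match=no
fi

printf 'requested:  %s\nconfigured: %s\nmatch:      %s\n' \
  "$requested_store" "$configured_stores" "$match"
```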

Philip Durbin 🚀 (Dec 11 2023 at 20:27):

Ok, good luck! I'd probably add 1, 2, and 3 to each of the errors above to try to figure out which one is being triggered.
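The listing of the three locations above looks like the output of a recursive text search. A sketch of one way to reproduce it, assuming a local checkout of the Dataverse source and a grep that supports --include (GNU and BSD grep both do). The function name is made up for illustration, and the line numbers will differ between versions.

```bash
# List every Java source line that contains the error message.
# $1: directory to search, e.g. src/main/java inside a Dataverse checkout.
find_error_sites() {
  grep -rn --include='*.java' -F \
    'Dataset store configuration does not allow provided storageIdentifier.' "$1"
}
```

Called as find_error_sites src/main/java from the top of the checkout, it prints one file:line:text entry per match. Those are the places where a distinguishing 1, 2, or 3 could be appended to the message before rebuilding.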

jamie jamison (Dec 11 2023 at 20:28):

will do, thank you

jamie jamison (Dec 12 2023 at 02:26):

I am trying to figure out where the errors are coming from. But, from a beginner perspective I'm not sure what the configuration should look like. There is only the one example in the documentation. Not sure what an S3 example would look like. I should probably also get on slack and see if I can find someone who has successfully set this up.

Philip Durbin 🚀 (Dec 12 2023 at 11:53):

Or you could ask on https://groups.google.com/g/dataverse-community of course.

Philip Durbin 🚀 (Dec 12 2023 at 11:53):

It's been a while since I set it up.


Last updated: Oct 30 2025 at 06:21 UTC