Stream: troubleshooting

Topic: store config does not allow storageIdentifier


uclr17 (Apr 24 2024 at 15:19):

Jan Range said:

I receive the following message every time I try to add a remote file:

{"status":"ERROR","message":"Dataset store configuration does not allow provided storageIdentifier."}

The storage identifier follows the base URL scheme but does not match.

Did you find the proper configuration to make it work?
I have the same error.
Thanks

Jan Range (Apr 24 2024 at 15:21):

Not yet, unfortunately. Shall we open a new topic to talk about the issue?

Philip Durbin 🚀 (Apr 24 2024 at 15:29):

@Jan Range I moved this message to a new topic.

@uclr17 are you using pyDataverse?

uclr17 (Apr 25 2024 at 08:11):

@Jan Range thank you for your feedback

@Philip Durbin I am not using pyDataverse; I use an HTTP client (Insomnia).
The generated client code for curl is:

curl --request POST \
--url 'http://localhost:8080/api/datasets/:persistentId/add?persistentId=perma%3A10.5072LRU%2FFSJ3BJ' \
--header 'Content-Type: multipart/form-data' \
--header 'User-Agent: insomnia/8.6.1' \
--header 'X-Dataverse-key: XXXXX' \
--form 'jsonData={"description":"My description cat.","storageIdentifier":"trsa://themes/custom/qdr/images/CoreTrustSeal-logo-transparent.png","checksumType":"MD5","md5Hash":"509ef88afa907eaf2c17c1c8d8fde77e","label":"testlogo.png","fileName":"testlogo.png","mimeType":"image/png"}'
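
For anyone scripting this instead of using Insomnia, here is a minimal Python sketch that assembles the same jsonData value as the curl call above. The helper names (md5_hex, build_json_data) are made up for illustration and are not part of Dataverse or pyDataverse; the keys are copied from the request above, and sending the form field is left out.

```python
import hashlib
import json


def md5_hex(data: bytes) -> str:
    """Return the hex MD5 digest, the format used for md5Hash above."""
    return hashlib.md5(data).hexdigest()


def build_json_data(storage_identifier, file_name, mime_type, md5_hash,
                    description=""):
    """Serialize the jsonData form field with the same keys as the curl call.

    Hypothetical helper: it only builds the string, it does not call the API.
    """
    payload = {
        "description": description,
        "storageIdentifier": storage_identifier,
        "checksumType": "MD5",
        "md5Hash": md5_hash,
        "label": file_name,
        "fileName": file_name,
        "mimeType": mime_type,
    }
    return json.dumps(payload)


json_data = build_json_data(
    "trsa://themes/custom/qdr/images/CoreTrustSeal-logo-transparent.png",
    "testlogo.png",
    "image/png",
    "509ef88afa907eaf2c17c1c8d8fde77e",
    description="My description cat.",
)
print(json_data)
```

The resulting string would be passed as the jsonData form field of the POST request; md5_hex is only needed when the checksum of the remote file has to be computed locally first.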

My config in my docker compose (I followed this doc to run my local Dataverse: https://guides.dataverse.org/en/latest/container/running/demo.html):

JVM_ARGS: -Ddataverse.files.storage-driver-id=file
-Ddataverse.files.file.type=file
-Ddataverse.files.file.label=Filesystem
-Ddataverse.files.file.directory=${STORAGE_DIR}/store
-Ddataverse.files.trsa.type=remote
-Ddataverse.files.trsa.label=RemoteTest
-Ddataverse.files.trsa.base-url=https://qdr.syr.edu
-Ddataverse.files.trsa.base-store=file
-Ddataverse.files.trsa.download-redirect=true
-Ddataverse.files.trsa.public=true
-Ddataverse.files.trsa.ingestsizelimit=0

The error message seems to come from the DataAccess.uploadToDatasetAllowed method:
https://github.com/IQSS/dataverse/blob/develop/src/main/java/edu/harvard/iq/dataverse/dataaccess/DataAccess.java#L357
called in addFileToDataset method:
https://github.com/IQSS/dataverse/blob/develop/src/main/java/edu/harvard/iq/dataverse/api/Datasets.java#L2500

I tried every storage service in my collection, and the error message is the same.
There is no error message in the Dataverse logs, only this info message:

dataverse           | [#|2024-04-25T07:38:09.854+0000|INFO|Payara 6.2023.8||_ThreadID=99;_ThreadName=http-thread-pool::http-listener-1(4);_TimeMillis=1714030689854;_LevelValue=800;|
dataverse           |   jsonData: {"description":"My description cat.","storageIdentifier":"trsa://themes/custom/qdr/images/CoreTrustSeal-logo-transparent.png","checksumType":"MD5","md5Hash":"509ef88afa907eaf2c17c1c8d8fde77e","label":"testlogo.png","fileName":"testlogo.png","mimeType":"image/png"}|#]

Thank you for the help.

Jan Range (Apr 25 2024 at 10:04):

I am unsure, but according to the documentation, the storageIdentifier URL should start with the base-url specified within the JVM Args. I tried that, but still received the same error message. Is this the intended way of doing that, @Philip Durbin ?
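
A small Python sketch of how the identifier and base-url seem to relate. The examples in this thread use the store id plus "://" plus a path (trsa://themes/...), not the full URL, so the sketch assumes the file URL is base-url joined with that path. That reading is an assumption and has not been checked against the Dataverse source; resolve_remote_url is a hypothetical helper, not a Dataverse function.

```python
def resolve_remote_url(storage_identifier: str, base_urls: dict) -> str:
    """Map '<store-id>://<path>' to '<base-url>/<path>'.

    Assumption: the path part of the identifier is relative to the
    store's base-url. base_urls maps store ids to their base-url values.
    """
    driver_id, sep, path = storage_identifier.partition("://")
    if not sep or driver_id not in base_urls:
        raise ValueError(f"no base-url known for identifier: {storage_identifier}")
    return base_urls[driver_id].rstrip("/") + "/" + path.lstrip("/")


url = resolve_remote_url(
    "trsa://themes/custom/qdr/images/CoreTrustSeal-logo-transparent.png",
    {"trsa": "https://qdr.syr.edu"},
)
print(url)
```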

uclr17 (Apr 25 2024 at 11:06):

I followed the JVM options and JSON input from this test class:
https://github.com/IQSS/dataverse/blob/v6.2/src/test/java/edu/harvard/iq/dataverse/api/RemoteStoreIT.java#L48
https://github.com/IQSS/dataverse/blob/v6.2/src/test/java/edu/harvard/iq/dataverse/api/RemoteStoreIT.java#L58

Philip Durbin 🚀 (Apr 25 2024 at 12:02):

Ok, type=remote. Hmm.

@uclr17 can you please point me to the section of the API Guide you are using? Is it https://guides.dataverse.org/en/6.2/developers/big-data-support.html#trusted-remote-storage-with-the-remote-store-type ?

uclr17 (Apr 25 2024 at 12:45):

I am using 4 sources of documentation:
https://guides.dataverse.org/en/latest/api/native-api.html#add-remote-file-api
https://guides.dataverse.org/en/6.2/developers/big-data-support.html#trusted-remote-storage-with-the-remote-store-type
https://guides.dataverse.org/en/latest/installation/config.html#trusted-remote-storage
https://github.com/IQSS/dataverse/blob/v6.2/src/test/java/edu/harvard/iq/dataverse/api/RemoteStoreIT.java

uclr17 (Apr 26 2024 at 14:18):

@Jan Range to make it work, you need to add this option to JVM_ARGS:
-Ddataverse.files.trsa.upload-out-of-band=true
Then, in the collection that holds the dataset you want to update, edit the storage setting and select the remote store (RemoteTest with my config).

Final config:

JVM_ARGS: -Ddataverse.files.storage-driver-id=file1
  -Ddataverse.files.file1.type=file
  -Ddataverse.files.file1.label=Filesystem
  -Ddataverse.files.file1.directory=${STORAGE_DIR}/store
  -Ddataverse.files.trsa.type=remote
  -Ddataverse.files.trsa.label=RemoteTest
  -Ddataverse.files.trsa.base-url=https://qdr.syr.edu
  -Ddataverse.files.trsa.base-store=file1
  -Ddataverse.files.trsa.download-redirect=true
  -Ddataverse.files.trsa.public=true
  -Ddataverse.files.trsa.ingestsizelimit=0
  -Ddataverse.files.trsa.upload-out-of-band=true
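
To make the two conditions concrete, here is a simplified Python model of the behaviour reported in this thread. It is not the Dataverse implementation of DataAccess.uploadToDatasetAllowed: it only encodes what was observed here, namely that the store id in the storageIdentifier has to be the store selected for the collection, and that this store needs upload-out-of-band=true. The function names are hypothetical.

```python
import re

FINAL_JVM_ARGS = """
-Ddataverse.files.storage-driver-id=file1
-Ddataverse.files.file1.type=file
-Ddataverse.files.file1.label=Filesystem
-Ddataverse.files.file1.directory=${STORAGE_DIR}/store
-Ddataverse.files.trsa.type=remote
-Ddataverse.files.trsa.label=RemoteTest
-Ddataverse.files.trsa.base-url=https://qdr.syr.edu
-Ddataverse.files.trsa.base-store=file1
-Ddataverse.files.trsa.download-redirect=true
-Ddataverse.files.trsa.public=true
-Ddataverse.files.trsa.ingestsizelimit=0
-Ddataverse.files.trsa.upload-out-of-band=true
"""


def parse_store_settings(jvm_args: str) -> dict:
    """Collect -Ddataverse.files.<store>.<option>=<value> pairs per store."""
    stores = {}
    pattern = r"-Ddataverse\.files\.([\w-]+)\.([\w-]+)=(\S+)"
    for store, option, value in re.findall(pattern, jvm_args):
        stores.setdefault(store, {})[option] = value
    return stores


def upload_allowed(stores: dict, collection_store: str,
                   storage_identifier: str) -> bool:
    """Simplified model of the check, based only on this thread."""
    driver_id = storage_identifier.split("://", 1)[0]
    if driver_id != collection_store:
        # The identifier points at a store the collection does not use.
        return False
    # The selected store must allow out-of-band uploads.
    return stores.get(driver_id, {}).get("upload-out-of-band") == "true"


stores = parse_store_settings(FINAL_JVM_ARGS)
identifier = "trsa://themes/custom/qdr/images/CoreTrustSeal-logo-transparent.png"
print(upload_allowed(stores, "trsa", identifier))
print(upload_allowed(stores, "file1", identifier))
```

With the final config, the model accepts the identifier only when the collection uses the trsa store; removing the upload-out-of-band option makes it reject the request, which matches the error seen with the first config.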

You don't need to add -Ddataverse.files.trsa.upload-out-of-band=true or select the remote store for the collection if you create a new dataset through the native API and include the "files" metadata:
https://guides.dataverse.org/en/latest/api/native-api.html#submit-dataset

    "metadataBlocks": {
       ...
    },
    "files": [
      {
        "label": "cb47471c-42bd-4ad6-8268-04a0109b2b85.png",
        "restricted": false,
        "dataFile": {
          "filename": "cb47471c-42bd-4ad6-8268-04a0109b2b85.png",
          "contentType": "image/png",
          "friendlyType": "PNG Image",
          "storageIdentifier": "trsa://themes/custom/qdr/images/CoreTrustSeal-logo-transparent.png",
          "md5": "509ef88afa907eaf2c17c1c8d8fde77e",
          "checksum": {
            "type": "MD5",
            "value": "509ef88afa907eaf2c17c1c8d8fde77e"
          },
          "tabularData": false
        }
      }
    ],

Philip Durbin 🚀 (Apr 26 2024 at 15:01):

@uclr17 you got it working? That's great!

Jan Range (Apr 26 2024 at 15:56):

@uclr17 that is great to hear! Thanks for sharing, will try to replicate it with my instance :smile:

Philip Durbin 🚀 (Apr 26 2024 at 16:00):

Do we need more docs?

uclr17 (Apr 26 2024 at 19:06):

@Philip Durbin yes, it works. It took some debugging sessions to understand the DataAccess.uploadToDatasetAllowed method.

The upload-out-of-band option is documented for the s3 storage type:
https://guides.dataverse.org/en/latest/installation/config.html#second-configure-your-dataverse-installation-to-use-s3-storage
but not for the remote storage type:
https://guides.dataverse.org/en/latest/installation/config.html#trusted-remote-storage
The option is used in the StorageIO.isDirectUploadEnabled method, which is called here:
https://github.com/IQSS/dataverse/blob/77c71024deda8f32a77be64bcd5210e20a1b8f6a/src/main/java/edu/harvard/iq/dataverse/dataaccess/DataAccess.java#L378

Philip Durbin 🚀 (Apr 26 2024 at 19:07):

Cool. Where in the docs should we add more info?

Philip Durbin 🚀 (Apr 26 2024 at 19:08):

I know you were looking at a lot of docs! :sweat_smile:


Last updated: Oct 30 2025 at 06:21 UTC