Hello :) @Aamir Muhammad contacted me about CVOC and would like a first hands-on experience under Docker. I thought a basic starter kit would be useful, so here is a procedure to enable the demo files of the https://github.com/gdcc/dataverse-external-vocab-support project:
# usual docker start
mvn -Pct clean package docker:run
# add cvocdemo.tsv
wget https://raw.githubusercontent.com/gdcc/dataverse-external-vocab-support/main/examples/metadatablocks/cvocdemo.tsv -O /tmp/cvocdemo.tsv
docker cp /tmp/cvocdemo.tsv dataverse-1:/opt/payara/file.tsv
docker exec dataverse-1 curl http://localhost:8080/api/admin/datasetfield/load -H "Content-type: text/tab-separated-values" -X POST --data-binary @/opt/payara/file.tsv
# http://localhost:8080/dataverseuser.xhtml?selectTab=apiTokenTab to get API TOKEN
curl -H "X-Dataverse-key:$API_TOKEN" -X POST -H "Content-type:application/json" -d "[\"citation\",\"extVocabDemoMetadata\"]" http://localhost:8080/api/dataverses/:root/metadatablocks
# update solr schema via dataverse api
docker exec dataverse-1 wget https://raw.githubusercontent.com/IQSS/dataverse/develop/conf/solr/9.3.0/update-fields.sh -O /opt/payara/update-fields.sh
docker exec dataverse-1 wget https://raw.githubusercontent.com/IQSS/dataverse/develop/conf/solr/9.3.0/schema.xml -O /opt/payara/schema.xml
docker exec dataverse-1 chmod u+x /opt/payara/update-fields.sh
docker exec dataverse-1 wget http://localhost:8080/api/admin/index/solr/schema -O /opt/payara/partial-schema.xml
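# install the tools update-fields.sh needs (refresh package lists first, answer prompts non-interactively)
docker exec --user root dataverse-1 sh -c "apt-get update && apt-get install -y bc ed"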
docker exec dataverse-1 /opt/payara/update-fields.sh /opt/payara/schema.xml /opt/payara/partial-schema.xml
# reload solr with schema
docker cp dataverse-1:/opt/payara/schema.xml /tmp/schema.xml
docker cp /tmp/schema.xml solr-1:/var/solr/data/collection1/conf/
docker exec solr-1 curl "http://localhost:8983/solr/admin/cores?action=RELOAD&core=collection1"
# deploy demo cvoc conf
wget https://raw.githubusercontent.com/gdcc/dataverse-external-vocab-support/main/examples/config/cvoc-conf.json -O /tmp/cvoc-conf.json
curl -X PUT --upload-file /tmp/cvoc-conf.json http://localhost:8080/api/admin/settings/:CVocConf
After that, you can edit a dataset and see the new metadata block configured :
Screenshot-from-2024-04-11-09-34-18.png
Thanks to @Jérôme Roucou for the script.
Thanks @luddaniel and @Jérôme Roucou for putting in such effort to help.
Some interesting ideas here!
Please note that the update-fields script is already included in the configbaker image
Instead of copying around the files it might be easier to mount the conf dir when using docker run with configbaker
IIRC when using the dev persona the API is not locked down, so there should be no need to run the curl commands inside the app container
If you would like to do this in production, you should take a look at the demo persona. The admin api will require an unblock key to be reachable from outside of the container
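To illustrate the two points above about the admin API, here is a minimal sketch of calling it from the host rather than through `docker exec`. It assumes the app container publishes port 8080 on localhost, as in the procedure at the top of the thread. `DATAVERSE_URL`, `UNBLOCK_KEY`, and the function names are hypothetical helpers made up for this example, and passing the key as an `unblock-key` query parameter is an assumption to check against the guides for your Dataverse version.

```shell
#!/bin/sh
# Sketch only: call the Dataverse admin API from the host.
# DATAVERSE_URL and UNBLOCK_KEY are hypothetical variables for this example.
DATAVERSE_URL="${DATAVERSE_URL:-http://localhost:8080}"
UNBLOCK_KEY="${UNBLOCK_KEY:-}"

# Build an admin API URL; append the unblock key only when one is set
# (assumed to be needed with the demo persona, not with the dev persona).
admin_url() {
  if [ -n "$UNBLOCK_KEY" ]; then
    printf '%s/api/admin/%s?unblock-key=%s\n' "$DATAVERSE_URL" "$1" "$UNBLOCK_KEY"
  else
    printf '%s/api/admin/%s\n' "$DATAVERSE_URL" "$1"
  fi
}

# Load a metadata block TSV directly from the host filesystem.
load_tsv() {
  curl -X POST -H "Content-type: text/tab-separated-values" \
    --data-binary "@$1" "$(admin_url datasetfield/load)"
}

# Deploy a CVOC configuration file to the :CVocConf setting.
put_cvoc_conf() {
  curl -X PUT --upload-file "$1" "$(admin_url settings/:CVocConf)"
}
```

With the dev persona this would be used as `load_tsv /tmp/cvocdemo.tsv` followed by `put_cvoc_conf /tmp/cvoc-conf.json`, which avoids the `docker cp` step for the TSV.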
Glad to see our upstream containers work for you! :smile:
Thanks @Oliver Bertuch. I think it would be great to have a couple of scripts to quickly configure a Docker setup (adding a language with a package, adding/updating a TSV, reloading the schema), because when we switch from PR to PR we want to start from scratch and re-create a context.
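As a rough idea of what such scripts could look like, here is a sketch that wraps a few steps from the procedure at the top of the thread in small functions. The container names, paths, and URLs are taken from that procedure. The `run` helper, the `DRY_RUN`, `APP`, and `SOLR` variables, and the function names are hypothetical and only exist for this example. It defaults to dry-run mode, so it prints the commands instead of executing them.

```shell
#!/bin/sh
# Sketch only: reusable building blocks for re-creating a test context.
# DRY_RUN, APP and SOLR are hypothetical variables for this example.
DRY_RUN="${DRY_RUN:-1}"
APP="${APP:-dataverse-1}"
SOLR="${SOLR:-solr-1}"

# Print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Copy a metadata block TSV into the app container and load it.
load_block() {
  run docker cp "$1" "$APP:/opt/payara/file.tsv"
  run docker exec "$APP" curl http://localhost:8080/api/admin/datasetfield/load \
    -H "Content-type: text/tab-separated-values" -X POST \
    --data-binary @/opt/payara/file.tsv
}

# Move the updated schema.xml from the app container to the Solr container.
push_schema() {
  run docker cp "$APP:/opt/payara/schema.xml" /tmp/schema.xml
  run docker cp /tmp/schema.xml "$SOLR:/var/solr/data/collection1/conf/"
}

# Ask Solr to reload the core so the new schema is picked up.
reload_solr_core() {
  run docker exec "$SOLR" curl \
    "http://localhost:8983/solr/admin/cores?action=RELOAD&core=collection1"
}
```

Calling `load_block /tmp/cvocdemo.tsv` prints the two `docker` commands it would run. Setting `DRY_RUN=0` before calling the functions executes them for real.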
I am doing this on my local computer. I am very new to these technologies. We don't have a production instance. I am trying to work with external vocabularies and understand how they integrate with Dataverse.
Absolutely! That's why we have configbaker! :smile: Back in the day when we wrote our roadmap we already had on our radar to make the existing scripts more modular, so they can be easily reused as building blocks. For example we load a few metadata blocks by default, so this should be a small snippet, easy to run on its own as part of a customized bootstrapping or standalone execution
At some point we were also talking about including a flavor that simply is an nginx to serve static files like the vocab scripts and previewers. That way you could simply serve all of the stuff in one go, also ready for further development
Maybe you want to add this to the agenda of the container WG meeting later today? See #containers > weekly meeting
You or @Jérôme Roucou could even give a little demo and maybe we can talk about how this can be improved and generalised for further use?
I'll talk with the team ;)
Sure, you're all very welcome to come to our weekly containerization working group meeting. :grinning: https://ct.gdcc.io
I'm thinking about loading a metadata block and updating Solr over at #containers > update-fields.sh
Last updated: Nov 01 2025 at 14:11 UTC