Stream: dev

Topic: Run dataverse-external-vocab-support demo under docker


view this post on Zulip luddaniel (Apr 11 2024 at 07:43):

Hello :) @Aamir Muhammad contacted me about CVOC (controlled vocabulary) support and would like to try it out under Docker. I thought a basic starter kit would be useful, so here is a procedure to enable the demo files from the https://github.com/gdcc/dataverse-external-vocab-support project:

# usual docker start
mvn -Pct clean package docker:run

# add cvocdemo.tsv
wget https://raw.githubusercontent.com/gdcc/dataverse-external-vocab-support/main/examples/metadatablocks/cvocdemo.tsv -O /tmp/cvocdemo.tsv
docker cp /tmp/cvocdemo.tsv dataverse-1:/opt/payara/file.tsv
docker exec dataverse-1 curl http://localhost:8080/api/admin/datasetfield/load -H "Content-type: text/tab-separated-values" -X POST --data-binary @/opt/payara/file.tsv

# get an API token at http://localhost:8080/dataverseuser.xhtml?selectTab=apiTokenTab and export it as API_TOKEN, then enable the new block on the root collection
curl -H "X-Dataverse-key:$API_TOKEN" -X POST -H "Content-type:application/json" -d "[\"citation\",\"extVocabDemoMetadata\"]" http://localhost:8080/api/dataverses/:root/metadatablocks

# update solr schema via dataverse api
docker exec dataverse-1 wget https://raw.githubusercontent.com/IQSS/dataverse/develop/conf/solr/9.3.0/update-fields.sh -O /opt/payara/update-fields.sh
docker exec dataverse-1 wget https://raw.githubusercontent.com/IQSS/dataverse/develop/conf/solr/9.3.0/schema.xml -O /opt/payara/schema.xml
docker exec dataverse-1 chmod u+x /opt/payara/update-fields.sh
docker exec dataverse-1 wget http://localhost:8080/api/admin/index/solr/schema -O /opt/payara/partial-schema.xml
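# install the tools update-fields.sh needs (refresh package lists first; -y avoids an interactive prompt)
docker exec --user root dataverse-1 sh -c "apt-get update && apt-get install -y bc ed"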
docker exec dataverse-1 /opt/payara/update-fields.sh /opt/payara/schema.xml /opt/payara/partial-schema.xml

# reload solr with schema
docker cp dataverse-1:/opt/payara/schema.xml /tmp/schema.xml
docker cp /tmp/schema.xml solr-1:/var/solr/data/collection1/conf/
docker exec solr-1 curl "http://localhost:8983/solr/admin/cores?action=RELOAD&core=collection1"

# deploy demo cvoc conf
wget https://raw.githubusercontent.com/gdcc/dataverse-external-vocab-support/main/examples/config/cvoc-conf.json -O /tmp/cvoc-conf.json
curl -X PUT --upload-file /tmp/cvoc-conf.json http://localhost:8080/api/admin/settings/:CVocConf

After that, you can edit a dataset and see the new metadata block configured:
[Screenshot: Screenshot-from-2024-04-11-09-34-18.png]

Thanks to @Jérôme Roucou for the script.
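
One rough way to check that the block was loaded is to list the metadata blocks and look for the new name. This is only a sketch: the `has_block` helper is a hypothetical name, and it assumes the `/api/metadatablocks` response carries a `"name"` entry per block.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the JSON text in $1 mentions a metadata
# block named $2. This is a plain grep, not a JSON parser, so treat it as
# a smoke test only.
has_block() {
  printf '%s' "$1" | grep -Eq "\"name\" *: *\"$2\""
}

# Usage against a running instance (URL as used in the steps above):
#   json="$(curl -s http://localhost:8080/api/metadatablocks)"
#   has_block "$json" extVocabDemoMetadata && echo "block loaded"
```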

view this post on Zulip Aamir Muhammad (Apr 11 2024 at 08:11):

Thanks @luddaniel and @Jérôme Roucou for putting in such effort to help.

view this post on Zulip Oliver Bertuch (Apr 11 2024 at 08:12):

Some interesting ideas here!

view this post on Zulip Oliver Bertuch (Apr 11 2024 at 08:13):

Please note that the update-fields script is already included in the configbaker image

view this post on Zulip Oliver Bertuch (Apr 11 2024 at 08:15):

Instead of copying around the files it might be easier to mount the conf dir when using docker run with configbaker
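
A minimal sketch of that idea, under several assumptions that are not confirmed in this thread: the image tag `gdcc/configbaker:unstable`, `update-fields.sh` being on the image's PATH, and `/work` as the mount point. The `update_schema` and `run` helpers are hypothetical names, and `DRY_RUN=1` only prints the command instead of running it.

```shell
#!/bin/sh
# Assumption: the image tag and the /work mount point are illustrative choices.
CONFIGBAKER_IMAGE="${CONFIGBAKER_IMAGE:-gdcc/configbaker:unstable}"

# Run a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then echo "$*"; else "$@"; fi
}

# Hypothetical helper: update schema.xml in a host directory that also holds
# partial-schema.xml, using the update-fields.sh shipped in the image
# (assumed to be on the image's PATH).
update_schema() {
  run docker run --rm -v "$1:/work" "$CONFIGBAKER_IMAGE" \
    update-fields.sh /work/schema.xml /work/partial-schema.xml
}

# Usage: update_schema "$PWD/solr-conf"
```

The mounted directory has to be writable by the user the container runs as, since the script rewrites schema.xml in place.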

view this post on Zulip Oliver Bertuch (Apr 11 2024 at 08:16):

IIRC when using the dev persona the API is not locked down, so there should be no need to run the curl commands inside the app container
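
If that holds, the TSV load and the partial schema download from the first post could run from the host directly. A sketch, assuming port 8080 is published on the host and the admin API is reachable there; `DATAVERSE_URL`, `run`, `load_tsv` and `fetch_partial_schema` are illustrative names, and `DRY_RUN=1` prints the commands (without their quoting) instead of running them.

```shell
#!/bin/sh
# Assumption: Dataverse is reachable from the host at this URL.
DATAVERSE_URL="${DATAVERSE_URL:-http://localhost:8080}"

# Run a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then echo "$*"; else "$@"; fi
}

# Load a metadata block TSV that lives on the host; no docker cp needed.
load_tsv() {
  run curl -X POST -H "Content-type: text/tab-separated-values" \
    --data-binary "@$1" "$DATAVERSE_URL/api/admin/datasetfield/load"
}

# Save the partial Solr schema generated by Dataverse to a host file.
fetch_partial_schema() {
  run curl -s -o "$1" "$DATAVERSE_URL/api/admin/index/solr/schema"
}

# Usage:
#   load_tsv /tmp/cvocdemo.tsv
#   fetch_partial_schema /tmp/partial-schema.xml
```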

view this post on Zulip Oliver Bertuch (Apr 11 2024 at 08:19):

If you would like to do this in production, you should take a look at the demo persona. The admin api will require an unblock key to be reachable from outside of the container
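
As a sketch of what that changes for the commands in this thread: with an unblock key configured, admin API calls from outside the container can pass it as the `unblock-key` query parameter. The `admin_url` helper and the `UNBLOCK_KEY` variable are hypothetical names, and the key is assumed to contain no characters that need URL encoding.

```shell
#!/bin/sh
# Hypothetical helper: build an admin API URL that carries the unblock key.
# $1 is the path below /api/admin/, for example "settings/:CVocConf".
admin_url() {
  printf '%s/api/admin/%s?unblock-key=%s\n' \
    "${DATAVERSE_URL:-http://localhost:8080}" "$1" "$UNBLOCK_KEY"
}

# Usage (UNBLOCK_KEY must be set to the key configured on the instance):
#   curl -X PUT --upload-file /tmp/cvoc-conf.json "$(admin_url settings/:CVocConf)"
```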

view this post on Zulip Oliver Bertuch (Apr 11 2024 at 08:20):

Glad to see our upstream containers work for you! :smile:

view this post on Zulip luddaniel (Apr 11 2024 at 08:24):

Thanks @Oliver Bertuch. I think it would be great to have a couple of scripts to quickly configure a Docker setup (adding a language with a package, adding/updating a TSV, reloading the schema), because when we switch from PR to PR we usually want to start from scratch and re-create a context.
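
A rough sketch of what such a script might look like, reusing only the API calls from the first post. Everything else is an assumption for illustration: the `bootstrap_context` and `run` names, the layout of one directory holding `*.tsv` files plus an optional `cvoc-conf.json`, and an admin API reachable from the host. The Solr schema update and language packs are left out.

```shell
#!/bin/sh
# Assumption: Dataverse is reachable from the host at this URL.
DATAVERSE_URL="${DATAVERSE_URL:-http://localhost:8080}"

# Run a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then echo "$*"; else "$@"; fi
}

# Hypothetical helper: re-create a context from one directory.
# Loads every *.tsv metadata block, then cvoc-conf.json if it is present.
bootstrap_context() {
  dir="$1"
  for tsv in "$dir"/*.tsv; do
    [ -e "$tsv" ] || continue   # the directory held no TSV files
    run curl -X POST -H "Content-type: text/tab-separated-values" \
      --data-binary "@$tsv" "$DATAVERSE_URL/api/admin/datasetfield/load"
  done
  if [ -e "$dir/cvoc-conf.json" ]; then
    run curl -X PUT --upload-file "$dir/cvoc-conf.json" \
      "$DATAVERSE_URL/api/admin/settings/:CVocConf"
  fi
}

# Usage: DRY_RUN=1 bootstrap_context ./my-context   # preview the calls first
```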

view this post on Zulip Aamir Muhammad (Apr 11 2024 at 08:25):

I am doing this on my local computer. I am very new to these technologies, and we don't have a production instance. I am trying to work with external vocabularies and understand how they integrate with Dataverse.

view this post on Zulip Oliver Bertuch (Apr 11 2024 at 08:26):

Absolutely! That's why we have configbaker! :smile: Back in the day when we wrote our roadmap we already had on our radar to make the existing scripts more modular, so they can be easily reused as building blocks. For example we load a few metadata blocks by default, so this should be a small snippet, easy to run on its own as part of a customized bootstrapping or standalone execution

view this post on Zulip Oliver Bertuch (Apr 11 2024 at 08:28):

At some point we were also talking about including a flavor that simply is an nginx to serve static files like the vocab scripts and previewers. That way you could simply serve all of the stuff in one go, also ready for further development

view this post on Zulip Oliver Bertuch (Apr 11 2024 at 08:28):

Maybe you want to add this to the agenda of the container WG meeting later today? See #containers > weekly meeting

view this post on Zulip Oliver Bertuch (Apr 11 2024 at 08:29):

You or @Jérôme Roucou could even give a little demo and maybe we can talk about how this can be improved and generalised for further use?

view this post on Zulip luddaniel (Apr 11 2024 at 08:40):

I'll talk with the team ;)

view this post on Zulip Philip Durbin 🚀 (Apr 11 2024 at 10:42):

Sure, you're all very welcome to come to our weekly containerization working group meeting. :grinning: https://ct.gdcc.io

view this post on Zulip Philip Durbin 🚀 (Nov 07 2024 at 13:59):

I'm thinking about loading a metadata block and updating Solr over at #containers > update-fields.sh


Last updated: Nov 01 2025 at 14:11 UTC