Run dataverse-external-vocab-support demo under docker · dev

Hello :) @Aamir Muhammad contacted me about CVOC and would like to get a first experience with it under Docker. I thought a basic starter kit would be useful, so here is a procedure to enable the demo files of the https://github.com/gdcc/dataverse-external-vocab-support project:

# usual docker start
mvn -Pct clean package docker:run

# add cvocdemo.tsv
wget https://raw.githubusercontent.com/gdcc/dataverse-external-vocab-support/main/examples/metadatablocks/cvocdemo.tsv -O /tmp/cvocdemo.tsv
docker cp /tmp/cvocdemo.tsv dataverse-1:/opt/payara/file.tsv
docker exec dataverse-1 curl http://localhost:8080/api/admin/datasetfield/load -H "Content-type: text/tab-separated-values" -X POST --data-binary @/opt/payara/file.tsv

# get an API token at http://localhost:8080/dataverseuser.xhtml?selectTab=apiTokenTab and export it as API_TOKEN, then enable the block on the root collection
curl -H "X-Dataverse-key:$API_TOKEN" -X POST -H "Content-type:application/json" -d "[\"citation\",\"extVocabDemoMetadata\"]" http://localhost:8080/api/dataverses/:root/metadatablocks

# update solr schema via dataverse api
docker exec dataverse-1 wget https://raw.githubusercontent.com/IQSS/dataverse/develop/conf/solr/9.3.0/update-fields.sh -O /opt/payara/update-fields.sh
docker exec dataverse-1 wget https://raw.githubusercontent.com/IQSS/dataverse/develop/conf/solr/9.3.0/schema.xml -O /opt/payara/schema.xml
docker exec dataverse-1 chmod u+x /opt/payara/update-fields.sh
docker exec dataverse-1 wget http://localhost:8080/api/admin/index/solr/schema -O /opt/payara/partial-schema.xml
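# bc and ed are needed by update-fields.sh; -y avoids an interactive prompt under docker exec
docker exec --user root dataverse-1 apt-get update
docker exec --user root dataverse-1 apt-get install -y bc ed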
docker exec dataverse-1 /opt/payara/update-fields.sh /opt/payara/schema.xml /opt/payara/partial-schema.xml

# reload solr with schema
docker cp dataverse-1:/opt/payara/schema.xml /tmp/schema.xml
docker cp /tmp/schema.xml solr-1:/var/solr/data/collection1/conf/
docker exec solr-1 curl "http://localhost:8983/solr/admin/cores?action=RELOAD&core=collection1"

# deploy demo cvoc conf
wget https://raw.githubusercontent.com/gdcc/dataverse-external-vocab-support/main/examples/config/cvoc-conf.json -O /tmp/cvoc-conf.json
curl -X PUT --upload-file /tmp/cvoc-conf.json http://localhost:8080/api/admin/settings/:CVocConf
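
To check that the steps above took effect, something like the following could be run from the host. This is only a sketch: it assumes the dev setup, where the admin API answers on localhost:8080 without a key, and the function name `check_cvoc_setup` is made up for illustration.

```shell
# Hypothetical helper (not part of Dataverse): read back what the steps above configured.
# Assumes the dev containers are running and port 8080 is published on localhost.
DATAVERSE_URL="${DATAVERSE_URL:-http://localhost:8080}"

check_cvoc_setup() {
  # The block loaded from cvocdemo.tsv should show up in the list of metadata blocks.
  curl -s "$DATAVERSE_URL/api/metadatablocks"
  echo
  # The setting written with PUT in the last step should come back here.
  curl -s "$DATAVERSE_URL/api/admin/settings/:CVocConf"
  echo
}
```

Calling `check_cvoc_setup` prints both JSON responses, so a missing block or an empty setting is easy to spot before opening the dataset form.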

Aamir Muhammad (Apr 11 2024 at 08:11):

Oliver Bertuch (Apr 11 2024 at 08:12):

Oliver Bertuch (Apr 11 2024 at 08:13):

Please note that the update-fields script is already included in the configbaker image

Oliver Bertuch (Apr 11 2024 at 08:15):

Instead of copying around the files it might be easier to mount the conf dir when using docker run with configbaker

Oliver Bertuch (Apr 11 2024 at 08:16):

IIRC when using the dev persona the API is not locked down, so there should be no need to run the curl commands inside the app container
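
If that holds, the TSV load and the Solr field export from the procedure above could be run from the host, without `docker cp` or `docker exec`. A rough sketch, assuming port 8080 is published on localhost; the two function names are made up for illustration:

```shell
# Sketch only: same admin API calls as in the procedure, but issued from the host.
# Assumes the dev persona leaves /api/admin reachable on the published port.
DATAVERSE_URL="${DATAVERSE_URL:-http://localhost:8080}"

# $1 = path to a metadata block TSV on the host
load_tsv_from_host() {
  curl -X POST -H "Content-type: text/tab-separated-values" \
    --data-binary "@$1" "$DATAVERSE_URL/api/admin/datasetfield/load"
}

# $1 = where to write the field definitions that update-fields.sh consumes
fetch_solr_fields_from_host() {
  curl -s "$DATAVERSE_URL/api/admin/index/solr/schema" -o "$1"
}
```

For example, `load_tsv_from_host /tmp/cvocdemo.tsv` followed by `fetch_solr_fields_from_host /tmp/partial-schema.xml`.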

Oliver Bertuch (Apr 11 2024 at 08:19):

If you would like to do this in production, you should take a look at the demo persona. The admin api will require an unblock key to be reachable from outside of the container
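
As a sketch of what that could look like: as far as I know the key is passed as an `unblock-key` query parameter on admin API calls. The helper name `admin_url` and the `UNBLOCK_KEY` variable are assumptions for illustration, and the key value has to come from your own configuration.

```shell
# Sketch only: build an admin API URL, appending the unblock key when one is set.
# UNBLOCK_KEY and admin_url are hypothetical names, not something Dataverse defines.
DATAVERSE_URL="${DATAVERSE_URL:-http://localhost:8080}"

admin_url() {
  url="$DATAVERSE_URL$1"
  if [ -n "${UNBLOCK_KEY:-}" ]; then
    case "$url" in
      *\?*) url="$url&unblock-key=$UNBLOCK_KEY" ;;  # URL already has a query string
      *)    url="$url?unblock-key=$UNBLOCK_KEY" ;;
    esac
  fi
  printf '%s\n' "$url"
}
```

The CVOC configuration step would then become `curl -X PUT --upload-file /tmp/cvoc-conf.json "$(admin_url /api/admin/settings/:CVocConf)"`.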

Oliver Bertuch (Apr 11 2024 at 08:20):

luddaniel (Apr 11 2024 at 08:24):

Thanks @Oliver Bertuch! I think it would be great to have a couple of scripts to quickly configure a Docker setup (adding a language with its package, adding/updating a TSV, reloading the schema), because when we switch from PR to PR we usually want to start from scratch and re-create a context.

Aamir Muhammad (Apr 11 2024 at 08:25):

I am doing this on my local computer. I am very new to these technologies. We don't have a production instance. I am trying to work with external vocabularies and understand how they integrate with Dataverse.

Oliver Bertuch (Apr 11 2024 at 08:26):

Absolutely! That's why we have configbaker! :smile: Back when we wrote our roadmap, making the existing scripts more modular was already on our radar, so that they can easily be reused as building blocks. For example, we load a few metadata blocks by default, so this should be a small snippet that is easy to run on its own, either as part of a customized bootstrapping or as a standalone execution.
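
To make the idea concrete, such a building block might look roughly like this. It is not an existing configbaker script, just a sketch reusing the API calls from the procedure above; the function name `load_and_enable_block` is hypothetical and `API_TOKEN` is expected to be set by the caller.

```shell
# Hypothetical building block: load one metadata block TSV and enable it on the root collection.
# $1 = URL of the TSV file, $2 = block name as declared in the TSV; API_TOKEN must be set.
DATAVERSE_URL="${DATAVERSE_URL:-http://localhost:8080}"

load_and_enable_block() {
  tsv_url="$1"
  block_name="$2"
  tmp_tsv="$(mktemp)"
  # Download the TSV, load its fields via the admin API, then enable the block
  # (together with citation) on the root collection. Stop at the first failure.
  curl -fsSL "$tsv_url" -o "$tmp_tsv" &&
  curl -X POST -H "Content-type: text/tab-separated-values" \
    --data-binary "@$tmp_tsv" "$DATAVERSE_URL/api/admin/datasetfield/load" &&
  curl -X POST -H "X-Dataverse-key:$API_TOKEN" -H "Content-type:application/json" \
    -d "[\"citation\",\"$block_name\"]" "$DATAVERSE_URL/api/dataverses/:root/metadatablocks"
  status=$?
  rm -f "$tmp_tsv"
  return "$status"
}
```

The Solr schema still has to be updated and the core reloaded afterwards, as in the procedure at the top of this thread.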

Oliver Bertuch (Apr 11 2024 at 08:28):

At some point we were also talking about including a flavor that is simply an nginx serving static files such as the vocab scripts and previewers. That way you could serve all of the stuff in one go, also ready for further development.

Oliver Bertuch (Apr 11 2024 at 08:28):

Maybe you want to add this to the agenda of the container WG meeting later today? See #containers > weekly meeting

Oliver Bertuch (Apr 11 2024 at 08:29):

You or @Jérôme Roucou could even give a little demo and maybe we can talk about how this can be improved and generalised for further use?

luddaniel (Apr 11 2024 at 08:40):

Philip Durbin 🚀 (Apr 11 2024 at 10:42):

Sure, you're all very welcome to come to our weekly containerization working group meeting. :grinning: https://ct.gdcc.io