Hi! Sorry for opening this up again, but I am again not sure where to put my question :) We have finally finished our evaluation and want to deploy Dataverse in production now, ideally using containers. Is there still a big "don't use this in production" flag on it all? If yes, perhaps we can contribute to creating a setup for production.
@Jutta Schnabel hi! Great question. In our documentation, we are still being cautious about recommending containers for production but there are brave souls already using them!
Security is top of mind for me. When working on the tutorial for using containers as a demo, I made sure to show how to run the setup-all.sh script WITHOUT the --insecure flag.
Please see https://guides.dataverse.org/en/6.2/container/running/demo.html#creating-and-running-a-demo-persona for more on this.
Especially this:
"One of the main differences between the “dev” persona and our new “demo” persona is that we are now running the setup-all script without the --insecure flag. This makes our installation more secure, though it does block “admin” APIs that are useful for configuration."
You could go to the equivalent of https://demo.dataverse.org/api/admin/index/solr/schema on your server and make sure you can't reach it from outside. If you can, please see https://guides.dataverse.org/en/6.2/installation/config.html#blocking-api-endpoints
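A minimal sketch of that check in Python, meant to be run from a machine outside your network. It assumes that any 2xx answer means the admin API is reachable and that any other status or a failed connection means it is not; the exact response a blocked endpoint returns depends on your blocking configuration. The helper names `admin_url`, `probe`, and `is_exposed` are made up for this illustration.

```python
import urllib.error
import urllib.request


def admin_url(base_url):
    """Build the admin endpoint URL mentioned above for a given server."""
    return base_url.rstrip("/") + "/api/admin/index/solr/schema"


def probe(url, timeout=10):
    """Return the HTTP status code for url, or None if no response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (for example 4xx or 5xx).
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar.
        return None


def is_exposed(status):
    """Assumption: only a 2xx answer counts as 'reachable from outside'."""
    return status is not None and 200 <= status < 300
```

For example, `is_exposed(probe(admin_url("https://dataverse.example.edu")))` (hypothetical hostname) should be `False` on a locked-down installation.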
What do others think? What am I forgetting? What do we need for the images to be production-ready? More docs? More configurability?
Stable image tags like 6.2/6.3 are needed for me. Besides that I'm happy with the current containers and they work great in production :)
That makes sense. Currently our tags are "alpha" and "unstable".
We've definitely talked about tags in recent meetings (#containers > weekly meeting ). Probably the best issue to watch is #10478. Plus we have the following topics going:
Great, thanks, that sounds hopeful :) We will try that out and see how we manage.
Sounds good. Please keep the feedback coming!
Besides that I assume the different projects and their means to create images https://github.com/gdcc/dataverse-kubernetes , https://github.com/EOSC-synergy/dataverse-kubernetes , https://github.com/IQSS/dataverse-docker and https://github.com/IQSS/dataverse/blob/develop/docker-compose-dev.yml are a bit confusing for someone new to the community. Hence, my wish-list contains a clean-up, or more detailed documentation or clarification about the projects and their relationship (they all look like official GDCC/IQSS projects)...
@Johannes D good idea. I just created this issue: Document competing containerization efforts and how to choose #10522
Not entirely a container issue, but it would be nice to rework the documentation so that it is independent of the installation method. In particular, the admin and installation guides are in some cases quite specific to a particular installation method. This makes both demo/eval and production use cases more complex than they need to be.
Heads up I will be out all of May (as of next Monday) and will continue working on this in June
Johannes D said:
are a bit confusing for someone new to the community
I agree. I've been meaning to sunset gdcc/dataverse-kubernetes for some time now, but didn't have the time. A PR is welcome. (So don't delete but leave a hint in the README and archive it as read-only)
And documentation of the features that do not work out of the box in a Docker environment would be nice. Like RServe, make data count, or multiple dataverse instances. ... They work, but need a bit of tweaking....
Heads up that #10672 is ready for review! It will increase production usability :smiley:
@Philip Durbin are these container-only things going through the same process with sprint planning etc. or can/should we fast-track it?
My rule of thumb is to look at the files changed and to put it on the fast track if the changes only affect containers. This one can be fast-tracked, I'd say. Please feel free to put it in "ready for review" if you like.
Done!
It's good to have this in place - we can expect Temurin images based on Ubuntu 24.04 to land within the next 4 to 5 weeks. Always good to be prepared!
yeah
I don't think anyone here can easily do this:
Suggestions on how to test this:
Run the images on a K8s cluster
Ha! I'll edit it to say just run them in Docker :smiling:
Do you think you could help get https://github.com/gdcc/api-test-runner/blob/main/.github/workflows/manual.yml working again to test it?
Not sure - we're not talking Dataverse code here
The test is done once you successfully deploy - this is infrastructure, so the application doesn't matter
You could even try the base image with some other demo app
Oh, even if it were working the "manual" workflow wouldn't test it?
I dunno if we should at some point include a minimal testing app for automation of base image acceptance tests
Well it tests that it builds (that is done in CI already), the one thing left to do is run an actual application...
That doesn't need to be Dataverse, which is huge and clunky
Sure, sounds very useful.
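One way the "run an actual application" step could be automated is a small smoke check that polls the deployed demo app until it answers. The sketch below is only an illustration, not something from the PR: `wait_until_up` is a hypothetical helper, `fetch_status` is a caller-supplied function that returns an HTTP status code (or `None` on failure), and treating a 200 as "deployed successfully" is an assumption.

```python
import time


def wait_until_up(fetch_status, attempts=30, delay=2.0, sleep=time.sleep):
    """Poll fetch_status() until it returns 200; return True on success.

    fetch_status: callable returning an HTTP status code, or None on failure.
    attempts:     maximum number of polls before giving up.
    delay:        seconds to wait between polls.
    sleep:        injectable sleep function, so tests can skip real waiting.
    """
    for attempt in range(attempts):
        if fetch_status() == 200:
            return True
        # No need to wait after the final failed attempt.
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

Because the HTTP call is passed in, the same loop would work for any small demo app running on the base image, not only Dataverse.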
Do you feel like it would be a good addition to this PR to have such small-scale tests around?
Hmm, maybe? I mean, we want more testing of images before we publish them, generally.
True!
Do you feel we need to do this now or should we keep that for another issue/PR?
Meh, I don't think we need it now.
Some day I'd like to retire that api-test-runner repo and have the testing done upstream.
It might be a nice thing to try out https://github.com/arquillian/arquillian-testcontainers for this :smile_cat:
Related: #containers > failing tests in 6.3 from api-test-runner
@Oliver Bertuch I removed you as an assignee from #10672. Items in "ready for review" shouldn't have assignees so it's clear that anyone can pick them up and review them.
I just made this pull request:
update docs to suggest using Docker in production #11862
You can preview the docs here: https://dataverse-guide--11862.org.readthedocs.build/en/11862/installation/prep.html#choose-your-own-installation-adventure
What do you think?
Since we have been using those containers in production for the last couple of years, those images are ready for production usage... and the guide reflects it. However, one could add a note that 'make data count' does not work that well, and a section about backup & restore in a Docker environment is missing.
Hmm. I don't feel very qualified to write about backup and restore. What if we allow #11862 to be reviewed and merged and make a PR in the future for that?
I'm also not sure what I would write about Make Data Count. @Johannes D if you want to make a PR into my PR, please go ahead! :smile:
Maybe I should try to address this as well:
Document competing containerization efforts and how to choose #10522
I do feel pretty qualified to write about that. :smile:
Last updated: Oct 30 2025 at 05:14 UTC