containers for production · containers

Stream: containers

Topic: containers for production

Jutta Schnabel (Apr 23 2024 at 12:59):

Hi! Sorry for opening this up again, but I am again not sure where to put my question :) We have finally finished our evaluation and want to deploy Dataverse in production now, ideally using containers. Is there still a big "don't use this in production" flag on it all? If yes, perhaps we can contribute to creating a setup for production.

Philip Durbin 🚀 (Apr 23 2024 at 13:02):

@Jutta Schnabel hi! Great question. In our documentation, we are still being cautious about recommending containers for production but there are brave souls already using them!

Philip Durbin 🚀 (Apr 23 2024 at 13:03):

Security is top of mind for me. When working on the tutorial for using containers as a demo, I made sure to show how to run the setup-all.sh script WITHOUT the --insecure flag.

Philip Durbin 🚀 (Apr 23 2024 at 13:04):

Please see https://guides.dataverse.org/en/6.2/container/running/demo.html#creating-and-running-a-demo-persona for more on this.

Philip Durbin 🚀 (Apr 23 2024 at 13:04):

Especially this:

"One of the main differences between the “dev” persona and our new “demo” persona is that we are now running the setup-all script without the --insecure flag. This makes our installation more secure, though it does block “admin” APIs that are useful for configuration."

Philip Durbin 🚀 (Apr 23 2024 at 13:07):

You could go to the equivalent of https://demo.dataverse.org/api/admin/index/solr/schema on your server and make sure you can't reach it from outside. If you can, please see https://guides.dataverse.org/en/6.2/installation/config.html#blocking-api-endpoints
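
The reachability check described above can be scripted. The following is a minimal sketch, not an official tool: it assumes that any 2xx answer means the admin endpoint is exposed and that anything else (an error status or no response at all) means it is not reachable as-is. The helper names are hypothetical, and the script has to be run from a machine outside the server's network to be meaningful.

```python
import urllib.error
import urllib.request
from typing import Optional


def fetch_status(url: str, timeout: float = 10.0) -> Optional[int]:
    """Return the HTTP status code for url, or None if no response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (4xx/5xx).
        return err.code
    except OSError:
        # Connection refused, DNS failure, timeout and similar problems.
        return None


def admin_api_exposed(status: Optional[int]) -> bool:
    """Assumption: only a 2xx answer counts as 'reachable from outside'."""
    return status is not None and 200 <= status < 300


def check(base_url: str) -> str:
    """Probe the admin endpoint mentioned in the chat and summarize the result."""
    url = base_url.rstrip("/") + "/api/admin/index/solr/schema"
    status = fetch_status(url)
    if admin_api_exposed(status):
        return f"EXPOSED: {url} answered {status}"
    return f"not reachable as-is: {url} answered {status}"
```

For example, `print(check("https://dataverse.example.edu"))`, where the hostname is a placeholder for your own server. If the result starts with `EXPOSED`, the "blocking API endpoints" section linked above is the place to look.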

Philip Durbin 🚀 (Apr 23 2024 at 13:12):

What do others think? What am I forgetting? What do we need for the images to be production-ready? More docs? More configurability?

Johannes D (Apr 23 2024 at 13:29):

Stable image tags like 6.2/6.3 are needed for me. Besides that I'm happy with the current containers and they work great in production :)

Philip Durbin 🚀 (Apr 23 2024 at 13:30):

That makes sense. Currently our tags are "alpha" and "unstable".
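
To illustrate why the tag names matter for production, here is a small sketch that separates rolling tags from version tags in an image reference. It is only an illustration: the set of rolling tags, the version pattern, and the image names in the usage note are assumptions made for this example, not a description of how the project publishes images.

```python
import re
from typing import Tuple

# Assumption: tags that are re-published over time and therefore not reproducible.
ROLLING_TAGS = {"alpha", "unstable", "latest"}
# Assumption: a "stable" tag looks like 6.2 or 6.2.1.
VERSION_TAG = re.compile(r"^\d+\.\d+(\.\d+)?$")


def split_image(ref: str) -> Tuple[str, str]:
    """Split 'name:tag' into (name, tag); Docker falls back to 'latest' without a tag."""
    name, sep, tag = ref.rpartition(":")
    if not sep or "/" in tag:
        # No tag given, or the colon belonged to a registry port such as localhost:5000/app.
        return ref, "latest"
    return name, tag


def is_pinned(ref: str) -> bool:
    """Return True if the reference points at a fixed version or a digest."""
    if "@" in ref:
        # Digest references (name@sha256:...) are immutable by construction.
        return True
    _, tag = split_image(ref)
    return tag not in ROLLING_TAGS and VERSION_TAG.match(tag) is not None
```

With these assumptions, `is_pinned("gdcc/dataverse:6.2")` is true while `is_pinned("gdcc/dataverse:unstable")` is false, so a deployment script could refuse to roll out the latter.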

Philip Durbin 🚀 (Apr 23 2024 at 13:35):

We've definitely talked about tags in recent meetings (#containers > weekly meeting). Probably the best issue to watch is #10478. Plus we have the following topics going:

Jutta Schnabel (Apr 23 2024 at 13:36):

Great, thanks, that sounds hopeful :) We will try that out and see how we manage.

Philip Durbin 🚀 (Apr 23 2024 at 13:39):

Sounds good. Please keep the feedback coming!

Johannes D (Apr 23 2024 at 13:41):

Besides that I assume the different projects and their means to create images https://github.com/gdcc/dataverse-kubernetes , https://github.com/EOSC-synergy/dataverse-kubernetes , https://github.com/IQSS/dataverse-docker and https://github.com/IQSS/dataverse/blob/develop/docker-compose-dev.yml are a bit confusing for someone new to the community. Hence, my wish-list contains a clean-up, or more detailed documentation or clarification about the projects and their relationship (they all look like official GDCC/IQSS projects)...

Philip Durbin 🚀 (Apr 23 2024 at 14:41):

@Johannes D good idea. I just created this issue: Document competing containerization efforts and how to choose #10522

Johannes D (Apr 23 2024 at 15:08):

Not entirely a container issue, but it would be nice to rework the documentation so that it is independent of the installation method. In particular, the admin and installation guides are in some cases quite specific to a particular installation method. This makes both demo/eval and production use cases more complex than they need to be.

Oliver Bertuch (Apr 23 2024 at 15:11):

#10478

Heads up I will be out all of May (as of next Monday) and will continue working on this in June

Oliver Bertuch (Apr 23 2024 at 15:13):

Johannes D said:

are a bit confusing for someone new to the community

I agree. I've been meaning to sunset gdcc/dataverse-kubernetes for some time now, but didn't have the time. A PR is welcome. (So don't delete but leave a hint in the README and archive it as read-only)

Johannes D (Apr 23 2024 at 15:13):

And documentation of the features that do not work out of the box in a Docker environment would be nice. Like RServe, Make Data Count, or multiple Dataverse instances. ... They work, but need a bit of tweaking....

Oliver Bertuch (Jul 15 2024 at 14:06):

Heads up that #10672 is ready for review! It will increase production usability :smiley:

Oliver Bertuch (Jul 15 2024 at 14:07):

@Philip Durbin are these container-only things going through the same process with sprint planning etc. or can/should we fast-track it?

Philip Durbin 🚀 (Jul 15 2024 at 14:09):

My rule of thumb is to look at the files changed and to put it on the fast track if the changes only affect containers. This one can be fast-tracked, I'd say. Please feel free to put it in "ready for review" if you like.

Oliver Bertuch (Jul 15 2024 at 14:12):

Done!

Oliver Bertuch (Jul 15 2024 at 14:13):

It's good to have this in place - we can expect Temurin images based on Ubuntu 24.04 to land within the next 4 to 5 weeks. Always good to be prepared!

Philip Durbin 🚀 (Jul 15 2024 at 14:13):

yeah

Philip Durbin 🚀 (Jul 15 2024 at 14:13):

I don't think anyone here can easily do this:

Suggestions on how to test this:
Run the images on a K8s cluster

Oliver Bertuch (Jul 15 2024 at 14:14):

Ha! I'll edit it to say just run them in Docker :smiling:

Philip Durbin 🚀 (Jul 15 2024 at 14:15):

Do you think you could help get https://github.com/gdcc/api-test-runner/blob/main/.github/workflows/manual.yml working again to test it?

Oliver Bertuch (Jul 15 2024 at 14:16):

Not sure - we're not talking Dataverse code here

Oliver Bertuch (Jul 15 2024 at 14:17):

The test is done once you successfully deploy - this is infrastructure, so the application doesn't matter

Oliver Bertuch (Jul 15 2024 at 14:17):

You could even try the base image with some other demo app

Philip Durbin 🚀 (Jul 15 2024 at 14:17):

Oh, even if it were working the "manual" workflow wouldn't test it?

Oliver Bertuch (Jul 15 2024 at 14:17):

I dunno if we should at some point include a minimal testing app for automation of base image acceptance tests

Oliver Bertuch (Jul 15 2024 at 14:18):

Well it tests that it builds (that is done in CI already), the one thing left to do is run an actual application...
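
The remaining step described here, confirming that some application actually comes up on the base image, could be automated with a simple readiness poll. This is a rough sketch under stated assumptions: the function names, the default of 30 attempts, and the URL in the usage note are hypothetical, and "ready" is taken to mean an HTTP 200 from whatever small demo app is deployed.

```python
import time
import urllib.request
from typing import Callable


def wait_until_ready(probe: Callable[[], bool], attempts: int = 30, delay: float = 2.0) -> bool:
    """Call probe() until it returns True or the attempts run out."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            # Give the container time to start before asking again.
            time.sleep(delay)
    return False


def http_probe(url: str, timeout: float = 5.0) -> Callable[[], bool]:
    """Build a probe that reports True once url answers with HTTP 200."""
    def probe() -> bool:
        try:
            with urllib.request.urlopen(url, timeout=timeout) as response:
                return response.status == 200
        except OSError:
            # Not listening yet, connection reset, error status, or timeout.
            return False
    return probe
```

A CI job could start the container and then run `wait_until_ready(http_probe("http://localhost:8080/"))`, failing the build if it returns false. The port and path are placeholders for wherever the demo app listens.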

Oliver Bertuch (Jul 15 2024 at 14:18):

That doesn't need to be Dataverse, which is huge and clunky

Philip Durbin 🚀 (Jul 15 2024 at 14:19):

Sure, sounds very useful.

Oliver Bertuch (Jul 15 2024 at 14:19):

Do you feel like it would be a good addition to this PR to have such small-scale tests around?

Philip Durbin 🚀 (Jul 15 2024 at 14:21):

Hmm, maybe? I mean, we want more testing of images before we publish them, generally.

Oliver Bertuch (Jul 15 2024 at 14:21):

True!

Oliver Bertuch (Jul 15 2024 at 14:21):

Do you feel we need to do this now or should we keep that for another issue/PR?

Philip Durbin 🚀 (Jul 15 2024 at 14:21):

Meh, I don't think we need it now.

Philip Durbin 🚀 (Jul 15 2024 at 14:22):

Some day I'd like to retire that api-test-runner repo and have the testing done upstream.

Oliver Bertuch (Jul 15 2024 at 14:22):

It might be a nice thing to try out https://github.com/arquillian/arquillian-testcontainers for this :smile_cat:

Philip Durbin 🚀 (Jul 15 2024 at 19:41):

@Oliver Bertuch I removed you as an assignee from #10672. Items in "ready for review" shouldn't have assignees so it's clear that anyone can pick them up and review them.

Philip Durbin 🚀 (Oct 02 2025 at 14:45):

I just made this pull request:

update docs to suggest using Docker in production #11862

You can preview the docs here: https://dataverse-guide--11862.org.readthedocs.build/en/11862/installation/prep.html#choose-your-own-installation-adventure

What do you think?

Johannes D (Oct 06 2025 at 09:06):

Since we have been using those containers in production for the last couple of years, those images are ready for production usage... and the guide reflects it. However, one could add the information that Make Data Count does not work that well, and a section about backup & restore in a Docker environment is missing.

Philip Durbin 🚀 (Oct 16 2025 at 14:53):

Hmm. I don't feel very qualified to write about backup and restore. What if we allow #11862 to be reviewed and merged and make a PR in the future for that?

Philip Durbin 🚀 (Oct 16 2025 at 14:54):

I'm also not sure what I would write about Make Data Count. @Johannes D if you want to make a PR into my PR, please go ahead! :smile:

Philip Durbin 🚀 (Oct 16 2025 at 14:55):

Maybe I should try to address this as well:

Document competing containerization efforts and how to choose #10522

I do feel pretty qualified to write about that. :smile:

Oliver Bertuch (Nov 03 2025 at 15:06):

I think this would be a good addition for a production scenario. Not just in containers... :wink: #11948 CC @Leo Andreev

Philip Durbin 🚀 (Nov 03 2025 at 15:56):

yeah

Philip Durbin 🚀 (Nov 03 2025 at 15:57):

@Oliver Bertuch any feedback on the docs I wrote?

Philip Durbin 🚀 (Nov 18 2025 at 13:38):

#11862 has been merged! Docker in production! Thanks for reviewing and merging, @Steven Winship!

Last updated: Jan 09 2026 at 14:18 UTC