As a newbie to the Dataverse code, it seems to me that Payara or some other service is configuring the default Postgres database. Is that correct?
I am trying to alter the database while Docker is setting up the Postgres container, but it seems the database may already be in use by the time Docker Compose tries to attach the volume. For instance, I have tried telling Docker Compose to execute SQL on initialization of the container (by mounting SQL scripts from my local /postgres/apiCleanCreate directory):
volumes:
- ./postgres/apiCleanCreate:/docker-entrypoint-initdb.d
I have tried using data-only database dumps and full create/clean database dumps, but they all fail since the database is already open. I'm guessing other services may already be using the database at this point, so I am trying to get some ideas on how to apply a database alteration on initialization. Any thoughts?
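For context, here's a minimal sketch of the kind of compose service I'm describing (image tag, credentials, and service name are placeholders from my setup, not Dataverse's actual compose file):

```yaml
services:
  postgres:
    image: postgres:16            # placeholder tag
    environment:
      POSTGRES_USER: dataverse    # placeholder credentials
      POSTGRES_PASSWORD: secret
      POSTGRES_DB: dataverse
    volumes:
      # The image's entrypoint runs *.sql/*.sh files from this directory,
      # but only on the first start against an empty data volume.
      - ./postgres/apiCleanCreate:/docker-entrypoint-initdb.d
```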
I don't know if this helps or not, but here's an example of running a SQL script after the containers have been started: https://github.com/gdcc/api-test-runner/blob/5ee081923cf5f53d2f27b5f6918925ec6ebbecef/.github/workflows/develop.yml#L58 (It's necessary to get certain API tests to pass.)
It seems like you're operating at an earlier and lower level, though?
Hmm. That script does not alter any existing tables or data, so I'm not sure if it would work.
Sure, but one should be able to run a SQL script against a running database. psql -f update.sql or whatever.
Typically we do this stuff in Flyway these days, but it should work.
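For example, something like this (the service name "postgres" and the user/database name "dataverse" are guesses; adjust to your compose file):

```shell
# Copy the script into the running postgres container, then run it with psql.
# Service name "postgres" and user/db "dataverse" are assumptions.
docker compose cp update.sql postgres:/tmp/update.sql
docker compose exec postgres psql -U dataverse -d dataverse -f /tmp/update.sql
```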
Let me head back to the drawing board. I guess I need to figure out whether I can use insert statements or whether I need to upsert everything, and where foreign keys will be a problem.
Ok. I hope I'm not leading you astray. What does your SQL script do? What tables does it touch?
I guess, I'm thinking.... let's get the script working with psql -f first and then figure out where to put it longer term.
Question: when I dump the database (using pg_dump --inserts ...) that Docker throws together, I get statements like INSERT INTO public.dvobject. However, when I try to recompose a fresh Docker build using the dumped data (insert statements), Docker says public.dvobject does not exist.
I'm probably confused, but wouldn't you restore with something like psql mydb -f dump.sql?
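i.e., roughly this (user and database names are placeholders):

```shell
# Take a dump that includes the schema (the default; avoid --data-only
# if the target database is empty), then restore it with psql.
pg_dump -U dataverse --inserts dataverse > dump.sql
psql -U dataverse -d dataverse -f dump.sql
```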
I'm not sure what recompose means in this context.
You've taken a dump of your database. Now you want to restore that dump to a fresh database? In our Docker setup?
Sorry, I am trying to use the Docker volume mount point within the Docker Compose script (if possible). By "recompose" I mean clear out/delete the existing Docker containers and volumes and initiate a fresh compose, so at the moment I am not using the Postgres restore command. My last test was to simply execute some insert statements (again through the compose volumes directive), but for some reason the compose logs say public.dvobject (from the Postgres insert statement) does not exist. I guess part of the problem is that I do not know where the Postgres database gets its default configuration from, or when that executes in the Docker Compose process. I'm assuming some other service (Payara?) is configuring the database?
I'm not sure how it works in our containers, off the top of my head. In a classic installation, our Python script creates the database.
In our containers I'm pretty sure there's a fair amount of magic going on. MicroProfile Config stuff. I'm not exactly sure when or how the database gets created.
I have tried running database restore commands after the Dataverse bootstrapping is complete, but they all fail. Since other services are attached to the database, I don't believe the restore commands will work. Ideally this would happen within the postgres container before any other services attach to Postgres. I'm not sure how I could go about decoupling the database setup from the bootstrapping so the database can be set up entirely within the postgres service.
Would it be easy for me to reproduce what you're seeing? Would you like to provide scripts or specific commands?
Thanks @Philip Durbin, I don't think it is worth trying to duplicate the approach I was using, because it would not seem productive. Again, the database needs to be configured before services attach to it. The most useful thing to know would be how the database is being created and configured by the bootstrapping service (or whichever service is handling it).
Ok. Let's see if @Oliver Bertuch can offer some insight into the Docker side.
Philip Durbin said:
In our containers I'm pretty sure there's a fair amount of magic going on. MicroProfile Config stuff. I'm not exactly sure when or how the database gets created.
No magic involved... A classic installation uses the same MPCONFIG settings as our containers... JPA is the one creating the tables if they don't exist. The underlying database is created by the Docker entrypoint scripts of the postgres image (names are taken from the provided env vars). See also "Environment Variables" at https://hub.docker.com/_/postgres
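E.g., you can check which names the entrypoint was given and confirm the database it created (service and user names below are placeholders):

```shell
# Show the POSTGRES_* env vars the entrypoint saw on first start...
docker compose exec postgres env | grep '^POSTGRES_'
# ...and list the databases it created.
docker compose exec postgres psql -U dataverse -l
```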
Last updated: Oct 30 2025 at 05:14 UTC