Stream: containers

Topic: demo tutorial issues


view this post on Zulip Deirdre Kirmis (Mar 12 2024 at 17:03):

hi there .. i am attempting to follow the demo container install and am running into issues (of course .. it's me!) .. I launched an ec2 instance (ubuntu 22), installed docker, put the "ubuntu" user into the docker group, copied compose.yml, and ran "docker compose up" .. the installer exits with a bootstrap error (attached) ... and it looks like the solr container (and bootstrap) are exited .. the landing page looks like the attached ..
.. i tried both the "dev" and "demo" personas (with edits suggested) and same on both .. i tried stopping/removing all the containers and images and deleting the data directory and tried installing again a few times for both personas .. it's almost there, just not quite! :sweat_smile: .. any suggestions?

dv-docker-demo.jpg
dv-compose-error.jpg

view this post on Zulip Deirdre Kirmis (Mar 12 2024 at 17:13):

looking through docker logs for errors ..

view this post on Zulip Philip Durbin 🚀 (Mar 12 2024 at 17:57):

Hmm, can you please share your server.log file?

view this post on Zulip Deirdre Kirmis (Mar 12 2024 at 18:18):

here is payara logs text .. and also .. "docker logs bootstrap" errors ..

ServerLogs.txt
docker-bootstrap-errors.jpg

view this post on Zulip Philip Durbin 🚀 (Mar 12 2024 at 18:28):

Thanks, what does http://localhost:8080/api/info/version show?

view this post on Zulip Philip Durbin 🚀 (Mar 12 2024 at 18:32):

Nothing is jumping out at me.

view this post on Zulip Deirdre Kirmis (Mar 12 2024 at 18:44):

shows OK .. version 6.1

view this post on Zulip Deirdre Kirmis (Mar 12 2024 at 18:44):

okay, i'll keep messing with it .. thanks!

view this post on Zulip Philip Durbin 🚀 (Mar 12 2024 at 18:46):

Maybe it timed out. Maybe you need to bump the timeout.

view this post on Zulip Philip Durbin 🚀 (Mar 12 2024 at 18:47):

Please see https://preview.guides.gdcc.io/en/develop/container/running/demo.html#bootstrapping-did-not-complete

view this post on Zulip Deirdre Kirmis (Mar 12 2024 at 18:49):

that WAS my issue when I did this before :big_smile:

view this post on Zulip Deirdre Kirmis (Mar 12 2024 at 18:49):

i'll try it

view this post on Zulip Deirdre Kirmis (Mar 12 2024 at 19:45):

no luck .. set the timeout to 1m, 3m, 10m; tried increasing memory/CPU; tried running as root :woman_shrugging:
i do keep seeing the below bootstrap error at startup .. but the curl command to check version at command line is okay .. will keep trying things :smirk:
bootstrap-error.jpg

view this post on Zulip Philip Durbin 🚀 (Mar 12 2024 at 20:17):

Bah. Instead of a jpg, are you able to capture the entire output as a txt file?

view this post on Zulip Deirdre Kirmis (Mar 12 2024 at 20:52):

oops sorry wasn't getting all of the error output .. i think this has all of the startup output

StartupLogs.txt

view this post on Zulip Deirdre Kirmis (Mar 12 2024 at 20:52):

including those bootstrap errors

view this post on Zulip Deirdre Kirmis (Mar 12 2024 at 20:55):

weird the connection refused as i have everything open to port 8080 .. and it works from the command line

view this post on Zulip Deirdre Kirmis (Mar 12 2024 at 20:57):

that IP is a network device on the VPN .. i think?

view this post on Zulip Philip Durbin 🚀 (Mar 12 2024 at 21:00):

I'm seeing this: bootstrap | Waiting for http://dataverse:8080 to become ready in max 3m.

Have you tried 10 minutes?

view this post on Zulip Deirdre Kirmis (Mar 12 2024 at 21:01):

yes i did but i'll try it again

view this post on Zulip Philip Durbin 🚀 (Mar 12 2024 at 21:01):

Ok, thanks because I'm seeing bootstrap | 2024-03-12T20:47:14Z ERR Expectation failed error="timed out while making an http call, caused by: Get \"http://dataverse:8080/api/info/version\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)" timeout=3s followed by Dataverse continuing to try to deploy.

view this post on Zulip Deirdre Kirmis (Mar 12 2024 at 21:03):

maybe i wrote the timeout block wrong .. it looks like this:

bootstrap:
  container_name: "bootstrap"
  image: gdcc/configbaker:alpha
  restart: "no"
  environment:
    - TIMEOUT=3m
  command:
    - bootstrap.sh
    - dev

view this post on Zulip Deirdre Kirmis (Mar 12 2024 at 21:03):

from last time, i think maybe it needs to be TIMEOUT:3m right? agh

view this post on Zulip Deirdre Kirmis (Mar 12 2024 at 21:03):

or .. maybe i just ended up putting it in the command line

view this post on Zulip Philip Durbin 🚀 (Mar 12 2024 at 21:04):

The main thing is to check the output to make sure it changes to 10 or whatever.
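One way to make that check repeatable is to pull the announced timeout straight out of the bootstrap log. The sketch below is only an illustration: `effective_timeout` and `request_timeouts` are hypothetical helper names, and the patterns are inferred from the two log lines quoted earlier in this thread. It treats the `timeout=` field on error lines as a separate, per-check value rather than the overall wait, which is an assumption based on where that field appears in the log.

```shell
# Print the overall wait timeout announced in the bootstrap banner line.
# Reads log text on stdin; prints e.g. "10m", or nothing if no banner is found.
effective_timeout() {
  sed -n 's/.*to become ready in max \([0-9][0-9]*[a-z][a-z]*\)\..*/\1/p'
}

# Print the distinct values of the "timeout=" field found on error lines.
# Assumption: this field describes a single failed check, not the overall wait.
request_timeouts() {
  sed -n 's/.* timeout=\([0-9][0-9]*[a-z][a-z]*\).*/\1/p' | sort -u
}
```

Assuming the compose service is named bootstrap as in the snippet above, the log can be piped in with something like: docker compose logs bootstrap | effective_timeout. If the printed value does not match what was set in compose.yml, the setting is not reaching the container.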

view this post on Zulip Philip Durbin 🚀 (Mar 12 2024 at 21:06):

You're right, we DID talk about this before! Over here: https://dataverse.zulipchat.com/#narrow/stream/375812-containers/topic/running.20using.20compose/near/389735877

view this post on Zulip Philip Durbin 🚀 (Mar 12 2024 at 21:06):

Should be similar.

view this post on Zulip Deirdre Kirmis (Mar 12 2024 at 21:07):

yea i looked at that to get the format and made it just like that

view this post on Zulip Philip Durbin 🚀 (Mar 12 2024 at 21:07):

bah

view this post on Zulip Deirdre Kirmis (Mar 12 2024 at 21:07):

oh yea, and it is equals and not :

view this post on Zulip Philip Durbin 🚀 (Mar 12 2024 at 21:07):

Do you see 10 in the output?

view this post on Zulip Deirdre Kirmis (Mar 12 2024 at 21:16):

i still see 10s .. i think i need to format the yml file as an array like oliver suggested above in that thread .. trying that now

view this post on Zulip Philip Durbin 🚀 (Mar 12 2024 at 21:19):

Ok. If you figure it out, please consider making a pull request to improve https://github.com/IQSS/dataverse/blob/develop/doc/sphinx-guides/source/container/running/demo.rst !

view this post on Zulip Deirdre Kirmis (Mar 12 2024 at 21:23):

argh still didn't work .. still shows 10s in the log no matter how I format the .yml file .. will keep trying things .. somehow we got it working last time :big_smile:

view this post on Zulip Deirdre Kirmis (Mar 12 2024 at 21:24):

thanks for your help!

view this post on Zulip Philip Durbin 🚀 (Mar 12 2024 at 22:16):

Is 10 the default? I saw 3 in your output before. The main thing is for the change you make to have an effect.

view this post on Zulip Deirdre Kirmis (Mar 12 2024 at 22:36):

yea i don't think my changes are having an effect .. I have tried changing the timeout for the dataverse container in compose.yml to 3m, 5m, and 10m but it doesn't seem to matter, the output error continues to reflect 10s .. I tried both adding the "environment" section with TIMEOUT=10m (above) and adding the values in "array" format as suggested in the previous thread but nothing seems to be reflected on the output error message and the site doesn't completely build .. it always shows "Timeout exceeded while awaiting headers)" timeout=3s" ... will keep trying things ..
.. i restored the image that I took of my original instance and looked at the settings and made them exactly the same .. no win

view this post on Zulip Philip Durbin 🚀 (Mar 12 2024 at 23:58):

Bah. If you feel like it, you could create an issue saying we should document this better in that demo.rst file above.

view this post on Zulip Deirdre Kirmis (Mar 13 2024 at 00:51):

Oh man I can’t imagine this being better/easier :grinning_face_with_smiling_eyes: .. I just need to figure out what’s happening in my instance .. sorry if it sounded otherwise haha

view this post on Zulip Oliver Bertuch (Mar 13 2024 at 08:43):

What happens if you run this without compose? It should look like this:

❯ docker run -e TIMEOUT=10m gdcc/configbaker:unstable bootstrap.sh
Waiting for http://dataverse:8080 to become ready in max 10m.
2024-03-13T08:42:48Z INF [HTTP] Checking the http://dataverse:8080/api/info/version ...

view this post on Zulip Deirdre Kirmis (Mar 13 2024 at 15:05):

ah .. i thought there was a way to put the timeout in the command line! (Would help if I knew any docker commands) :big_smile:

ran that and just getting this error over and over:
2024-03-13T14:59:19Z INF [HTTP] Checking the http://dataverse:8080/api/info/version ...
2024-03-13T14:59:19Z ERR Expectation failed error="the status code doesn't expect" actual=403 expect=200

.. it creates a container called "recursing_kilby" .. will let it run and see what happens

view this post on Zulip Deirdre Kirmis (Mar 13 2024 at 15:19):

yea, just fails with "Error: context deadline exceeded" error

view this post on Zulip Oliver Bertuch (Mar 13 2024 at 15:51):

What is the first line of output there?

view this post on Zulip Oliver Bertuch (Mar 13 2024 at 15:52):

You might notice in my snippet there is the timeout properly set to 10m and not to 10s.

view this post on Zulip Deirdre Kirmis (Mar 13 2024 at 16:52):

yes i copied your command exactly as it is .. the first line of output says "Waiting for http://dataverse:8080 to become ready in max 10m."

view this post on Zulip Deirdre Kirmis (Mar 13 2024 at 16:52):

docker-command-line-timeout.jpg

view this post on Zulip Philip Durbin 🚀 (Mar 21 2024 at 20:10):

@Deirdre Kirmis I just made a pull request after doing a little testing. Please check it out: clarify how to increase timeout in docker demo #10410

view this post on Zulip Deirdre Kirmis (Mar 21 2024 at 21:40):

Looks great! I do have that set in my compose.yml .. but in the bootstrap log the first line of output shows the "waiting to become ready in max 10m" whereas further down there is an error "context deadline exceeded timeout=3s" so somewhere that setting is getting lost ..

..and the result is this.. http://dataverse-docker-qa.lib.asu.edu:8080/

Also, I have to remove the containers and images each time I reload because otherwise the site doesn't load at all ...

..haven't had time to work on this much the last few days, but will continue to troubleshoot as I can :woman_shrugging:

.. sorry for jumping in/out of the container WG meeting this morning .. too many super smart people in one place! :big_smile:

view this post on Zulip Deirdre Kirmis (Mar 21 2024 at 21:40):

timeout-set.JPG

view this post on Zulip Deirdre Kirmis (Mar 21 2024 at 21:42):

bootstrap.log.txt

view this post on Zulip Deirdre Kirmis (Mar 21 2024 at 21:44):

i did try installing on virtualbox and my mac and get the same issue .. i restored my image of the instance of the docker version that I had working a few months ago and it is doing the same thing now too .. weird

view this post on Zulip Philip Durbin 🚀 (Mar 21 2024 at 21:47):

Ok, so it sounds like you're seeing "10m" in the output. The change is in place. But all the time in the world isn't going to help! It's something else, right?

view this post on Zulip Deirdre Kirmis (Mar 21 2024 at 21:52):

well there is another error farther down indicating a 3s timeout .. I thought you had said that was related and may be affecting the bootstrap process .. not sure what is happening :sweat_smile:

view this post on Zulip Deirdre Kirmis (Mar 21 2024 at 21:55):

but yea, seems something else is likely going on .. this is an instance on a public subnet (ie: not behind a LB) .. i tried adding a security group rule allowing all traffic in case it was a port issue (not a good idea i know) .. but it is weird that my container instance that worked a couple months ago (ie: the site came up and I was able to login) now gives the same errors ..

view this post on Zulip Oliver Bertuch (Mar 22 2024 at 06:55):

Is this maybe a similar error as in #containers > curl DNS broken ?

view this post on Zulip Philip Durbin 🚀 (Mar 22 2024 at 11:06):

Maybe :thinking:

view this post on Zulip Deirdre Kirmis (Mar 22 2024 at 14:27):

yes, seems the same .. my instance is an ubuntu 22.04 with 8G RAM and 2 CPUs .. the behavior is the same, as I get the repeating message in the bootstrap log with 404 error .. I am using an instance that is on a public subnet with our DNS server pointing to the instance public URL .. going to the site shows the main dataverse landing with a "page not found" error

view this post on Zulip Philip Durbin 🚀 (Mar 22 2024 at 14:29):

"I did a bootstrap manually now by circumventing the DNS lookup (provided the Dataverse URL as command argument)"

view this post on Zulip Philip Durbin 🚀 (Mar 22 2024 at 14:30):

@Deirdre Kirmis are you interested in creating an issue for this problem? It sounds like we could use some more docs, at least.

view this post on Zulip Deirdre Kirmis (Mar 22 2024 at 14:31):

okay, sure will do

view this post on Zulip Deirdre Kirmis (Mar 22 2024 at 14:31):

how would i "do the bootstrap manually" to see if that works?

view this post on Zulip Philip Durbin 🚀 (Mar 22 2024 at 14:34):

If you run docker run -it --rm gdcc/configbaker:unstable bootstrap.sh -h you should see help output like this:

Usage: bootstrap.sh [-h] [-u instanceUrl] [-t timeout] [-e targetEnvFile] [<persona>]

Execute initial configuration (bootstrapping) of an empty Dataverse instance.
Known personas: dev base demo

Parameters:
  instanceUrl - Location on container network where to reach your instance. Default: 'http://dataverse:8080'
      timeout - Provide how long to wait for the instance to become available (using wait4x). Default: '2m'
targetEnvFile - Path to a file where the bootstrap process can expose information as env vars (e.g. dataverseAdmin's API token)
      persona - Configure persona to execute. Calls /scripts/bootstrap/<persona>/init.sh. Default: 'base'

Note: This script will wait for the Dataverse instance to be available before executing the bootstrapping.
      It also checks if already bootstrapped before (availability of metadata blocks) and skip if true.

view this post on Zulip Philip Durbin 🚀 (Mar 22 2024 at 14:34):

... and I assume you'd use -u instanceUrl

view this post on Zulip Philip Durbin 🚀 (Mar 22 2024 at 14:35):

Not that I've ever done it! :sweat_smile:

view this post on Zulip Deirdre Kirmis (Mar 22 2024 at 14:40):

ah okay .. was just trying to see if I could set it somewhere in compose.yml .. will try it

view this post on Zulip Philip Durbin 🚀 (Mar 22 2024 at 14:42):

Good luck! :shamrock: Maybe @Jutta Schnabel knows the magic. :magic_wand:

view this post on Zulip Oliver Bertuch (Mar 22 2024 at 14:42):

Yes you can! But it's hard, because you will need to have the internal IP of the container.

view this post on Zulip Oliver Bertuch (Mar 22 2024 at 14:42):

So I'd advise doing the following:

view this post on Zulip Oliver Bertuch (Mar 22 2024 at 14:44):

Execute docker run --rm -it --network <network> bash. The <network> is the one that compose creates for you, name is visible in docker network list

view this post on Zulip Oliver Bertuch (Mar 22 2024 at 14:46):

Once you have that shell, run the job, using bootstrap.sh -u <DV-container-IP> dev or whatever persona you want to use (which needs to be available to the container, so you might need to mount it into it before running when you want a custom one)

view this post on Zulip Oliver Bertuch (Mar 22 2024 at 14:48):

If you don't want to hop into the DV container to get the IP, you can get that information from docker network inspect <network name>
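Put together, the manual steps above might look roughly like the sketch below. It only assembles the command for review and does not run it. Several things here are assumptions rather than confirmed usage: the image is taken to be gdcc/configbaker:unstable (the `docker run` line above does not name one), `-u` is given a full URL because the help output describes it as an instance URL, and `manual_bootstrap_cmd` is a hypothetical helper name.

```shell
# Assemble (but do not run) a manual bootstrap command for review.
#   $1: compose network name, as shown by `docker network list`
#   $2: instance URL, e.g. http://<DV-container-IP>:8080 (assumed URL form)
#   $3: persona (defaults to dev)
manual_bootstrap_cmd() {
  network="$1"
  instance_url="$2"
  persona="${3:-dev}"
  # -u and -t are the options listed in the bootstrap.sh help output.
  printf 'docker run --rm -it --network %s gdcc/configbaker:unstable bootstrap.sh -u %s -t 10m %s\n' \
    "$network" "$instance_url" "$persona"
}
```

For example, manual_bootstrap_cmd <network> http://<DV-container-IP>:8080 prints a single docker run line that can be inspected and then pasted into the terminal. The container IP can be read from docker network inspect <network name>, as mentioned above.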

view this post on Zulip Oliver Bertuch (Mar 22 2024 at 14:57):

@Philip Durbin I added notes and ideas in the other topic

view this post on Zulip Philip Durbin 🚀 (Mar 22 2024 at 15:07):

I saw! Thanks for that and all of the above! :dataverse_man:

view this post on Zulip Deirdre Kirmis (Mar 22 2024 at 15:20):

^ working on this .. i found the dataverse container IP using "docker network inspect <network name>" .. and docker inspect dataverse gives me the same IP address (as long as the container is running) .. so then run "docker run -e TIMEOUT=10m gdcc/configbaker:unstable bootstrap.sh -u <IP address> dev" ?

view this post on Zulip Philip Durbin 🚀 (Mar 22 2024 at 15:21):

@Deirdre Kirmis if you'd like to follow along: #containers > curl DNS broken

view this post on Zulip Oliver Bertuch (Mar 22 2024 at 15:22):

@Deirdre Kirmis if you wait some more, I can give you a different image from GHCR.io for you to test (basically we need to wait until we have the notification over at https://github.com/IQSS/dataverse/pull/10414)

view this post on Zulip Deirdre Kirmis (Mar 22 2024 at 15:25):

okay, but if I were to manually do the bootstrap step (just for a learning experience :big_smile: ) should that command above work? It is giving me errors so I likely have it wrong.

view this post on Zulip Oliver Bertuch (Mar 22 2024 at 15:25):

Which one of those... I typed these from my head, so might have done a typo

view this post on Zulip Oliver Bertuch (Mar 22 2024 at 15:26):

Oh wait I see now

view this post on Zulip Oliver Bertuch (Mar 22 2024 at 15:27):

You're not following the instructions... :see_no_evil: :stuck_out_tongue: :man_teacher:

view this post on Zulip Deirdre Kirmis (Mar 22 2024 at 15:27):

typical

view this post on Zulip Oliver Bertuch (Mar 22 2024 at 15:27):

Please note that you must attach the container to the right network - otherwise Dataverse will not be reachable

view this post on Zulip Oliver Bertuch (Mar 22 2024 at 15:28):

You can go for your docker run command, but you need to add the network bit then

view this post on Zulip Oliver Bertuch (Mar 22 2024 at 15:32):

@Deirdre Kirmis can you try with this image plz? ghcr.io/gdcc/configbaker:10413-configbaker-alpine-downgrade :fingers_crossed:

view this post on Zulip Deirdre Kirmis (Mar 22 2024 at 15:38):

trying now ..

view this post on Zulip Philip Durbin 🚀 (Mar 22 2024 at 15:38):

:fingers_crossed:

view this post on Zulip Deirdre Kirmis (Mar 22 2024 at 15:44):

i got the same results, although I likely did something wrong .. i am using the demo tutorial instructions, so i edited compose.yml and replaced with the alpine image URL above and re-ran "docker compose up" .. should I try the docker run command and/or clone the repo and run that way?

view this post on Zulip Philip Durbin 🚀 (Mar 22 2024 at 15:45):

Can you please show us a diff?

view this post on Zulip Oliver Bertuch (Mar 22 2024 at 15:45):

OK we are getting to a point where it probably is much easier going for a Zoom call

view this post on Zulip Oliver Bertuch (Mar 22 2024 at 15:45):

Some live action on screen is probably making debugging easier

view this post on Zulip Deirdre Kirmis (Mar 22 2024 at 15:46):

it would help, too, if I had any idea what i'm doing :smile:

view this post on Zulip Oliver Bertuch (Mar 22 2024 at 15:46):

We're all here to learn

view this post on Zulip Deirdre Kirmis (Mar 22 2024 at 15:46):

some more than others haha

view this post on Zulip Oliver Bertuch (Mar 22 2024 at 15:49):

@Deirdre Kirmis do you want some live debugging action on Zoom?

view this post on Zulip Deirdre Kirmis (Mar 22 2024 at 15:51):

sure, if you have time, but don't want to be a pain .. if it fixes an issue? :big_smile:

view this post on Zulip Oliver Bertuch (Mar 22 2024 at 15:51):

https://fz-juelich-de.zoom.us/j/63165889221?pwd=MlBvL0V0WGM3N2gvamlqcGJuZVF5Zz09

view this post on Zulip Deirdre Kirmis (Mar 22 2024 at 16:21):

turns out i just put the image name in the wrong spot :sweat_smile: .. it is working! thank you @Oliver Bertuch

view this post on Zulip Oliver Bertuch (Mar 22 2024 at 16:23):

Alright we figured out:

  1. @Deirdre Kirmis used the alpha tagged image (as we told her in the demo), which dates before Christmas and does not have the backport.
  2. We tried using gdcc/configbaker:unstable, which contains the fix indeed, BUT there is a gotcha here: the resolution used the search parameter we get from resolv.conf and thus made curl go for dataverse.asu.edu and an external IP, not the internal one.
  3. We tried out the GHCR image with the 3.18 downgrade and it worked like a breeze!

Here's a screenshot of what happens at 2)
image.png
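To see whether a host is exposed to the gotcha in point 2, it can help to list which names the search setting could turn the short hostname into. The following is a rough sketch, not a diagnosis tool: `search_candidates` is a hypothetical helper, it only reads `search` lines from resolv.conf content on stdin, and it does not model the order in which a resolver actually tries the names.

```shell
# List names a resolver might try for a short hostname, given resolv.conf
# content on stdin: one "<host>.<domain>" per search domain, then the bare host.
# The output order is illustrative only; real lookup order depends on the resolver.
search_candidates() {
  host="$1"
  awk -v host="$host" '
    $1 == "search" { for (i = 2; i <= NF; i++) print host "." $i }
    END { print host }
  '
}
```

Assuming the image passes arbitrary commands through, the container's view could be checked with something like: docker run --rm --network <network> gdcc/configbaker:unstable cat /etc/resolv.conf | search_candidates dataverse. A line such as dataverse.asu.edu in the output would match the behaviour described in point 2.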

view this post on Zulip Oliver Bertuch (Mar 22 2024 at 16:25):

Thanks for being my guinea pig today @Deirdre Kirmis ! Much appreciated you took the time to dig through this!

view this post on Zulip Deirdre Kirmis (Mar 22 2024 at 16:33):

Happy to help any time! :smile: Thanks for teaching me some docker things!

view this post on Zulip Philip Durbin 🚀 (Mar 22 2024 at 16:58):

Good job, team! :tada:


Last updated: Oct 30 2025 at 05:14 UTC