Hello everyone, how are you?
I have been trying to upgrade my Dataverse installation from 5.14 to 6.2, but I have had a problem deploying the dataverse war file. The problem is with postgres.
I have created a new database, but the problem continues. There is an image showing the problem.
Does anyone know what I should do?
I downloaded the war file from https://github.com/IQSS/dataverse/releases/tag/v6.2
Hi! First, you have to upgrade to 6.0, then 6.1, then 6.2.
Second, can you please show us server.log?
@Philip Durbin thanks for the answer, so, I'm gonna upgrade version by version.
This is the log file.
Error_Dataverse6.2.txt
I don't understand why I have this postgres error
Did you do this step from https://github.com/IQSS/dataverse/releases/tag/v6.0 ?
sudo -u dataverse /usr/local/payara6/bin/asadmin create-jvm-options --add-opens=java.base/java.io=ALL-UNNAMED
I'm asking because in your log I see module java.base does not "opens java.io" to unnamed module and I'm reminded of #10068.
Hi @Philip Durbin I have decided to install the new dataverse version from scratch. Thank goodness I didn't have any information loaded yet.
Sounds easier!
I noticed Harvard is at 6.2, but this version: v. 6.2 build "v6.2+10451+10463+10383-iqss"
Any special reason why this is not "the" 6.2 release?
UVa is upgrading from 5.14 to 6.2 next week and I am wondering about running into problems. There seem to be a few that folks have encountered (from the Dataverse google group and here on Zulip).
As you might guess, #10451 #10463 and #10383 are all pull requests:
I guess these are all fixes we needed for Harvard Dataverse. They will all be included in 6.3.
I'm finally able to list buckets via aws cli. The production bucket policy doesn't have an explicit line for listing the bucket and doesn't seem to need it. But I added the explicit list ("Action": "s3:ListBucket",) to dataverse-test-oregon and suddenly I could list it with aws cli.
I'm not sure what the difference is but will do this for the new production system.
Back to shibboleth and doi setup correctly in domain.xml.
s3 bucket issues. The production bucket policy (RHEL 7) doesn't have an explicit line for listing the bucket and doesn't seem to need it. But I added the explicit list ("Action": "s3:ListBucket",) to dataverse-test-oregon and suddenly I could list it with aws cli. I don't know if this is a difference between 5.14 and 6.2 but I'm going to follow this example for the production 6.2.
Hmm, is this something we should fix in our guides?
I haven't installed a clean system since 4.11 or thereabouts. It seems that what worked up to 5.14 is different in 6.2. It's a bit of a jump between versions so I may have missed something. Examples for the pid/doi settings might be helpful. I'll post mine when I get through this install.
Sorry, I just realized I posted in the wrong place. I'm limping through 5.14 to 6.2
My new errors are actually helpful. Things I need to correct in domain.xml.
[#|2024-06-11T02:46:25.683+0000|WARNING|Payara 6.2023.8|edu.harvard.iq.dataverse.ThumbnailServiceWrapper|_ThreadID=90;_ThreadName=http-thread-pool::jk-connector(4);_TimeMillis=1718073985683;_LevelValue=900;|
getDatasetCardImageAsUrl(): Failed to initialize dataset StorageIO for s3://10.5072/FK2/QF48PD (getDataAccessObject: Unsupported storage method.)|#]
[#|2024-06-11T02:46:25.688+0000|WARNING|Payara 6.2023.8|edu.harvard.iq.dataverse.dataaccess.DataAccess|_ThreadID=90;_ThreadName=http-thread-pool::jk-connector(4);_TimeMillis=1718073985688;_LevelValue=900;|
Could not find storage driver for: s3|#]
[#|2024-06-11T02:46:25.688+0000|WARNING|Payara 6.2023.8|edu.harvard.iq.dataverse.ThumbnailServiceWrapper|_ThreadID=90;_ThreadName=http-thread-pool::jk-connector(4);_TimeMillis=1718073985688;_LevelValue=900;|
getDatasetCardImageAsUrl(): Failed to initialize dataset StorageIO for s3://10.5072/FK2/MN0MUZ (getDataAccessObject: Unsupported storage method.)|#]
[#|2024-06-11T02:46:25.693+0000|WARNING|Payara 6.2023.8|edu.harvard.iq.dataverse.dataaccess.DataAccess|_ThreadID=90;_ThreadName=http-thread-pool::jk-connector(4);_TimeMillis=1718073985693;_LevelValue=900;|
Could not find storage driver for: s3|#]
[#|2024-06-11T02:46:25.693+0000|WARNING|Payara 6.2023.8|edu.harvard.iq.dataverse.ThumbnailServiceWrapper|_ThreadID=90;_ThreadName=http-thread-pool::jk-connector(4);_TimeMillis=1718073985693;_LevelValue=900;|
getDatasetCardImageAsUrl(): Failed to initialize dataset StorageIO for s3://10.5072/FK2/KHOWY6 (getDataAccessObject: Unsupported storage method.)|#]
logs/server.log
jamie jamison said:
Sorry, I just realized I posted in the wrong place. I'm limping through 5.14 to 6.2
No worries, I moved the messages.
I'm not sure what some of the domain.xml settings should be so I'm working it out via trial and error.
documentation question (https://guides.dataverse.org/en/latest/installation/config.html#second-configure-your-dataverse-installation-to-use-s3-storage):
example: dataverse.files.storage-driver-id <id> Enable <id> as the default storage driver. file (<- default)
Since all my storage is in s3 buckets, should this be: -Ddataverse.files.storage-driver-id=s3
As long as your id is "s3", yes. If your id is "foobar" it should be "foobar".
I'm trying to track down storage errors:
[#|2024-06-11T02:46:25.683+0000|WARNING|Payara 6.2023.8|edu.harvard.iq.dataverse.ThumbnailServiceWrapper|_ThreadID=90;_ThreadName=http-thread-pool::jk-connector(4);_TimeMillis=1718073985683;_LevelValue=900;|
getDatasetCardImageAsUrl(): Failed to initialize dataset StorageIO for s3://10.5072/FK2/QF48PD (getDataAccessObject: Unsupported storage method.)|#]
[#|2024-06-11T02:46:25.688+0000|WARNING|Payara 6.2023.8|edu.harvard.iq.dataverse.dataaccess.DataAccess|_ThreadID=90;_ThreadName=http-thread-pool::jk-connector(4);_TimeMillis=1718073985688;_LevelValue=900;|
Could not find storage driver for: s3|#]
@Philip Durbin @jamie jamison Dataverse doesn't need listBucket but the CLI does, if you want to pull a listing.
Perhaps inaccurately, I was using aws cli to check if the buckets were accessible.
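As a concrete check, something like this separates listing from object access (bucket name taken from this thread; listing needs s3:ListBucket on the bucket itself, object reads need s3:GetObject):
```
aws s3 ls s3://dataverse-test-oregon                        # fails without s3:ListBucket
aws s3api get-bucket-policy --bucket dataverse-test-oregon  # inspect the current policy
```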
Right now I'm stuck at the point where payara6 is running but dataverse pages aren't loading.
I managed to break the test system to the point it no longer loads. Since I thought it was idempotent I tried running the ansible script again. I stopped and disabled payara6, ran the script and got this message:
TASK [dataverse : fire off installer] **********************
fatal: [localhost]: FAILED! => {"changed": true, "cmd": "/usr/bin/python3 /tmp/dvinstall/install.py -f --config_file=default.config --noninteractive > /tmp/dvinstall/install.out 2>&1", "delta": "0:01:26.506903", "end": "2024-06-13 22:32:09.208629", "msg": "non-zero return code", "rc": 1, "start": "2024-06-13 22:30:42.701726", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}
So I realize now it's not idempotent. I can rebuild from scratch if necessary; I've had some practice. But I'm wondering if dropping the dataverse database would help?
To rerun the ansible script: 1) stop and disable payara6, 2) drop (and probably dump first) the dvndb, 3) it probably doesn't hurt to delete the domain.xml file, and then rerun the ansible script.
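As shell commands, that sequence might look like the sketch below; the payara path and database name are the stock ones from this thread and may differ per install:
```
sudo systemctl stop payara6 && sudo systemctl disable payara6
sudo -u postgres pg_dump dvndb > /tmp/dvndb_backup.sql   # dump first, just in case
sudo -u postgres dropdb dvndb
sudo rm /usr/local/payara6/glassfish/domains/domain1/config/domain.xml
ansible-playbook --connection=local -i dataverse/inventory dataverse/dataverse.pb
```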
Go go go!
Well that was fun while it lasted. Now I can't start or restart payara6
Anyone out there with postgres experience?
I was able to reload the dvndb database but I have to set (or reset) the privileges. I'm trying to figure out what '=Tc/postgres postgres=CTc/postgres dvnuser=CTc/postgres' means so I can add that to dvndb.
=Tc/postgres? Where are you seeing that?
I log into postgres and list the databases
dvndb | postgres | UTF8 | C.UTF-8 | C.UTF-8 |
dvndb_bkup | postgres | UTF8 | C.UTF-8 | C.UTF-8 | =Tc/postgres +
| | | | | postgres=CTc/postgres+
| | | | | dvnuser=CTc/postgres
dvndp | postgres | UTF8 | C.UTF-8 | C.UTF-8 |
postgres | postgres | UTF8 | C.UTF-8 | C.UTF-8 |
template0 | postgres | UTF8 | C.UTF-8 | C.UTF-8 | =c/postgres +
| | | | | postgres=CTc/postgres
template1 | postgres | UTF8 | C.UTF-8 | C.UTF-8 | =c/postgres +
| | | | | postgres=CTc/postgres
(6 rows)
And yes, dvndp is a misspelling
Just found Create, connect and temporary (stackexchange)
After running ansible it seems you have to drop or rename that initial dvndb, make a new dvndb, reload it, and then add privileges. (By the way, did I mention that I lost my sysadmin support a week before RHEL7 end-of-life, hence the frantic googling to rebuild on a new system.)
I see, like this:
```
dvndb      | postgres | UTF8 | C.UTF-8 | C.UTF-8 |
dvndb_bkup | postgres | UTF8 | C.UTF-8 | C.UTF-8 | =Tc/postgres          +
           |          |      |         |         | postgres=CTc/postgres +
           |          |      |         |         | dvnuser=CTc/postgres
dvndp      | postgres | UTF8 | C.UTF-8 | C.UTF-8 |
postgres   | postgres | UTF8 | C.UTF-8 | C.UTF-8 |
template0  | postgres | UTF8 | C.UTF-8 | C.UTF-8 | =c/postgres           +
           |          |      |         |         | postgres=CTc/postgres
template1  | postgres | UTF8 | C.UTF-8 | C.UTF-8 | =c/postgres           +
           |          |      |         |         | postgres=CTc/postgres
(6 rows)
```
I'm using fenced code blocks above. Very useful here in Zulip and in GitHub. You could try this if you want:
```
test
```
But yeah, I'm not familiar with =Tc/postgres.
Check this out: https://www.postgresql.org/docs/current/ddl-priv.html#PRIVILEGE-ABBREVS-TABLE
and https://dba.stackexchange.com/questions/264441/is-my-database-secure-what-does-tc-postgres-allow/264445#264445 seems to say the same thing, that "T" is for TEMPORARY and "c" is for CONNECT.
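So recreating that ACL on the new dvndb would be something like this sketch (dvnuser is the role from the listing above):
```
# PUBLIC gets TEMPORARY + CONNECT (the "=Tc/postgres" entry)
sudo -u postgres psql -c "GRANT TEMPORARY, CONNECT ON DATABASE dvndb TO PUBLIC;"
# dvnuser gets CREATE + TEMPORARY + CONNECT ("dvnuser=CTc/postgres")
sudo -u postgres psql -c "GRANT CREATE, TEMPORARY, CONNECT ON DATABASE dvndb TO dvnuser;"
```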
@jamie jamison IIRC postgres ownership and privileges may need to be modified table-by-table
@jamie jamison if you dump and import as Phil suggests you can use the -O (no "owner") flag which may greatly simplify the process?
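A sketch of that dump-and-restore with -O (--no-owner), using the names from this thread:
```
pg_dump -O -U postgres dvndb > dvndb.sql   # no ownership statements in the dump
createdb -U postgres -O dvnuser dvndb      # target db owned by the app role
psql -U dvnuser -d dvndb -f dvndb.sql      # objects come out owned by dvnuser
```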
That makes rebuilding the database a bit more daunting. I'll go back and try that - dump from the previous test system and bring it over to the new one.
Dropped the database and reloaded from a dump. It now matches what the old server's postgres looked like, but I'm still not getting the dataverse front page. It may just take waiting longer.
Anything at the equivalent of https://demo.dataverse.org/api/info/version ?
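On the server itself the equivalent check is a one-liner, assuming the app answers on the stock port 8080:
```
curl http://localhost:8080/api/info/version
# a healthy instance answers something like: {"status":"OK","data":{"version":"6.2","build":""}}
```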
I'll try that as soon as I can get payara6 to start again
404 page not found, requested service not available
any errors in server.log?
But I do get the payara page
I'm digging in the server log now
[#|2024-06-14T00:19:37.128+0000|SEVERE|Payara 6.2023.8|edu.harvard.iq.dataverse.mydata.DataRetrieverAPI|_ThreadID=90;_ThreadName=http-thread-pool::jk-connector(5);_TimeMillis=1718324377128;_LevelValue=1000;|
Sorry, nothing was found for these roles: Admin, File Downloader, Dataverse + Dataset Creator, Dataverse Creator, Dataset Creator, Contributor, Curator, Member|#]
Looks like the database is messed up somehow
Huh. Yeah, sounds like it. :disappointed:
There has to be a way to reload or restore a database
I mean, a long time ago I used scripts like this to dump and restore:
10 years ago according to git. But something like that should work, I would think.
that looks like what I did (will paste the commands in)
It looks like @Deirdre Kirmis has experience with database backup and recovery: https://groups.google.com/g/dataverse-community/c/kTEUkxHB_ZM/m/T6aRZ9SpCwAJ
Ok, this is probably not an official way to fix things but after reloading the database I reran ansible and now the test looks ok.
At that point I did as @Philip Durbin suggested and ran with "/api/info/version" and got status 'ok' and version '6.2'
Is it possible that rerunning the ansible script fixed the missing roles?
I think I'm following the directions exactly from https://guides.dataverse.org/en/latest/installation/config.html#second-configure-your-dataverse-installation-to-use-s3-storage
Example:
<jvm-options>-Ddataverse.files.storage-driver-id=file</jvm-options>
<jvm-options>-Xmx3841m</jvm-options>
<jvm-options>-Ddataverse.files.dataverse-test-oregon.type=s3</jvm-options>
<jvm-options>-Ddataverse.files.dataverse-test-oregon.label=dataverse-test-oregon</jvm-options>
<jvm-options>-Ddataverse.files.dataverse-test-oregon.type=s3</jvm-options>
<jvm-options>-Ddataverse.files.dataverse-test-oregon=dataverse-test-oregon</jvm-options>
<jvm-options>-Ddataverse.files.dataverse-test-oregon.download-redirect=true</jvm-options>
<jvm-options>-Ddataverse.files.dataverse-test-oregon.upload-redirect=true</jvm-options>
If I restart and check the status of payara6 I get:
sudo systemctl status payara6
× payara6.service - Payara Server
Loaded: loaded (/etc/systemd/system/payara6.service; enabled; preset: disabled)
Active: failed (Result: exit-code) since Mon 2024-06-17 17:07:10 UTC; 8min ago
Process: 43806 ExecStart=/usr/bin/java -jar /usr/local/payara6/glassfish/lib/client/appserver-cli.jar start-domain (code=exited, status=1/FAILURE)
CPU: 2.396s
Jun 17 17:07:08 ip-172-31-23-165.us-west-2.compute.internal systemd[1]: Starting Payara Server...
Jun 17 17:07:10 ip-172-31-23-165.us-west-2.compute.internal java[43806]: Port 7676 is in use
Jun 17 17:07:10 ip-172-31-23-165.us-west-2.compute.internal java[43806]: Command start-domain failed.
Jun 17 17:07:10 ip-172-31-23-165.us-west-2.compute.internal systemd[1]: payara6.service: Control process exited, code=exited, status=1/FAILURE
Jun 17 17:07:10 ip-172-31-23-165.us-west-2.compute.internal systemd[1]: payara6.service: Failed with result 'exit-code'.
Jun 17 17:07:10 ip-172-31-23-165.us-west-2.compute.internal systemd[1]: Failed to start Payara Server.
Jun 17 17:07:10 ip-172-31-23-165.us-west-2.compute.internal systemd[1]: payara6.service: Consumed 2.396s CPU time.
[rocky@ip-172-31-23-165 ~]$
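The key line is "Port 7676 is in use" - 7676 is Payara's JMS broker port, so a leftover java process from an earlier start attempt is the usual suspect. A sketch for tracking it down:
```
sudo ss -ltnp | grep 7676    # show which PID is holding the port
sudo kill <PID>              # or: sudo pkill -f payara6 to sweep stale domains
sudo systemctl start payara6
```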
And, oddly, the test system looks like it's up, though with no buckets, until I reboot the system; then I get a 500.
And another question. When it's up I'm not able to log into the test dataverse with the default admin account. In the ansible script it looks like the admin account is 'dataverseAdmin' and the password is "admin1", which gets the error "username, email address, or password you entered is invalid."
And for shibboleth, following https://guides.dataverse.org/en/latest/installation/shibboleth.html#install-shibboleth-via-yum
I'm stuck at the error: Error: GPG check FAILED
is there anything more descriptive in the Payara server.log?
on Shibboleth: worst case you can set gpgcheck=0 in the repo config, but better to import the GPG key
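Importing the keys referenced by the shibboleth.repo file below would be:
```
sudo rpm --import https://shibboleth.net/downloads/service-provider/RPMS/repomd.xml.key
sudo rpm --import https://shibboleth.net/downloads/service-provider/RPMS/cantor.repomd.xml.key
```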
For the payara server log, part of the problem is I'm not sure what to look for. I'm partially relying on the time stamp.
For shib I'm obviously not importing the GPG key correctly. I have back-to-back patron consults till noon and then will dig back in.
my shibboleth.repo is looking at
gpgkey=https://shibboleth.net/downloads/service-provider/RPMS/repomd.xml.key
https://shibboleth.net/downloads/service-provider/RPMS/cantor.repomd.xml.key
That is the same thing that I have:
```
[shibboleth]
name=Shibboleth (rockylinux9)
type=rpm-md
mirrorlist=https://shibboleth.net/cgi-bin/mirrorlist.cgi/rockylinux9
gpgcheck=1
gpgkey=https://shibboleth.net/downloads/service-provider/RPMS/repomd.xml.key
https://shibboleth.net/downloads/service-provider/RPMS/cantor.repomd.xml.key
enabled=1
```
One thing I am running into is that out of the box I can't log into Dataverse. I've tried to unblock admin in the main.yml script but that didn't help.
Is there a way via the API to change the dataverseAdmin password? I'm reading through the documentation but haven't found that yet.
@jamie jamison if you're using Dataverse-Ansible you can set it as a group_var?
Ok, I'll look into that. I did try setting the dataverseAdmin password in the ansible script.
Other thing: even before I try to add s3 buckets I get this:
exception
jakarta.servlet.ServletException: /dataset.xhtml @595,164 rendered="#{settingsWrapper.makeDataCountDisplayEnabled and DatasetPage.doi}": The class 'edu.harvard.iq.dataverse.DatasetPage' does not have the property 'doi'.
root cause
jakarta.el.PropertyNotFoundException: /dataset.xhtml @595,164 rendered="#{settingsWrapper.makeDataCountDisplayEnabled and DatasetPage.doi}": The class 'edu.harvard.iq.dataverse.DatasetPage' does not have the property 'doi'.
I'm wondering where makeDataCount is enabled - trying to find it in main.yml
@jamie jamison I don't think Ansible knows about MDC yet?
Ok, mostly I'm trying to understand the error messages - a lot of this is new to me so I'm still limping up the learning curve
Since I'm using a reloaded dvndb database (from the RHEL7 test dataverse) I'm wondering if it's possible that MDC is in the database and causing that error.
And I may have found something in the 6.2 documentation - https://guides.dataverse.org/en/latest/admin/make-data-count.html#enable-or-disable-display-of-make-data-count-metrics
And that seems to solve at least that error. I'm putting that in my notes - things to consider when restoring a previous database
Actually the most exasperating issue at this point is that the default, out-of-the-box dataverseAdmin password doesn't work, though this might also be because of the reloaded dvndb database. Is that where passwords are stored?
Also, sometimes payara6 seems 'stuck': unable to stop or restart or even kill the process.
payara6.service - Payara Server
Loaded: loaded (/etc/systemd/system/payara6.service; enabled; preset: disabled)
Active: failed (Result: exit-code) since Wed 2024-06-26 17:43:43 UTC; 39min ago
Process: 720 ExecStart=/usr/bin/java -jar /usr/local/payara6/glassfish/lib/client/appserver-cli.jar start-domain (code=exited, st>
CPU: 1.304s
Jun 26 17:43:38 ip-172-31-23-165.us-west-2.compute.internal systemd[1]: Starting Payara Server...
Jun 26 17:43:43 ip-172-31-23-165.us-west-2.compute.internal java[720]: com.sun.enterprise.universal.xml.MiniXmlParserException: "Xml >
Jun 26 17:43:43 ip-172-31-23-165.us-west-2.compute.internal java[720]: Message: elementGetText() function expects text only elment bu>
Jun 26 17:43:43 ip-172-31-23-165.us-west-2.compute.internal java[720]: Command start-domain failed.
Jun 26 17:43:43 ip-172-31-23-165.us-west-2.compute.internal systemd[1]: payara6.service: Control process exited, code=exited, status=>
Jun 26 17:43:43 ip-172-31-23-165.us-west-2.compute.internal systemd[1]: payara6.service: Failed with result 'exit-code'.
Jun 26 17:43:43 ip-172-31-23-165.us-west-2.compute.internal systemd[1]: Failed to start Payara Server.
Jun 26 17:43:43 ip-172-31-23-165.us-west-2.compute.internal systemd[1]: payara6.service: Consumed 1.304s CPU time.
[rocky@ip-172-31-23-165 logs]$
@jamie jamison I've seen Payara do this on beta.dataverse.org - is there more information in server.log? on dataverseAdmin - since you have access to the DB, you could update the e-mail address associated with userid '1' in the authenticateuser table and trigger a password reset, or you could just blank the password entry?
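A sketch of that first route; the table and column names here assume a stock Dataverse schema, so verify against your own database first:
```
# point userid 1 at an address you control, then use "Forgot your password?"
# in the UI to trigger a reset mail (table name assumed: authenticateduser)
sudo -u postgres psql dvndb -c \
  "UPDATE authenticateduser SET email = 'you@example.edu' WHERE id = 1;"
```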
on restoring databases / moving servers: I keep a shell script around on test machines, reset_settings.sh or some such, with a list of commonly-required curl commands to make the test server a test server when I've imported a copy of the production database
And the problem turned out to be a typo by me. Now back on track.
The current s3 bucket related error is:
6.2023.8|edu.harvard.iq.dataverse.dataaccess.DataAccess|_ThreadID=88;_ThreadName=http-thread-pool::jk-connector(2);_TimeMillis=1719511203190;_LevelValue=900;|
Could not find storage driver for: s3|#]
Lastly, I can't seem to wget the reset_settings.sh file. Is there another place to get it?
About the storage driver issue, here is part of the domain.xml code:
<jvm-options>-Ddataverse.files.storage-driver-id=s3</jvm-options> <- is this correct, the default is 'file' but should it be changed to 's3' for the s3 buckets?
<jvm-options>-Xmx3841m</jvm-options>
<jvm-options>-Ddataverse.files.dataverse-test-oregon.type=s3</jvm-options>
<jvm-options>-Ddataverse.files.dataverse-test-oregon.label=dataverse-test-oregon</jvm-options>
<jvm-options>-Ddataverse.files.dataverse-test-oregon.bucket-name=dataverse-test-oregon</jvm-options>
<jvm-options>-Ddataverse.files.dataverse-test-oregon.download-redirect=true</jvm-options>
<jvm-options>-Ddataverse.files.dataverse-test-oregon.upload-redirect=true</jvm-options>
@jamie jamison the script was just something I threw together to make a copy of my production database suitable for our test server. I'm happy to share mine with you (just some API calls to change settings). if you're all S3, then yes you want storage-driver-id=s3. it may be because your label is dataverse-test-oregon instead of s3?
This is what I'm a bit confused about. According to the documentation:
dataverse.files.<id>.label <?> Required label to be shown in the UI for this storage. default: (none)
So if the label is what's shown in the UI, wouldn't the label be 'dataverse-test-oregon'?
Did you figure it out, what's shown in the UI? (I'm not sure, myself.)
we're still working on it. A fresh install with a reloaded database is messier than a plain fresh install. You guys will get the notes on how we did it when we're done.
Great, much appreciated. Good luck!
@jamie jamison the label appears to be a friendly name; the <id> value identifies the datastore. does that make sense?
In my case probably not. I understand the label. For the test I'm using the bucket name as the friendly name. For the <id>, I guess that's where I'm confused. If the default is 'file' for the local file system, what is appropriate for s3 storage?
Documentation:
dataverse.files.storage-driver-id <id> Enable <id> as the default storage driver. file
@jamie jamison the <id> bit threw me as well; ours is simply "s3" and suits us for now. I'll need to get more descriptive as we add datastores.
Ok. Right now dataverse 6.2 is dead in the water because it can't access the s3 buckets, and 5.14 is giving me grief since the operating system is end-of-life.
@jamie jamison remind me of the error you get accessing the buckets from 6.2?
jamie jamison: And the problem turned out to be a typo by me. Now back on track.
The current s3 bucket related error is:
6.2023.8|edu.harvard.iq.dataverse.dataaccess.DataAccess|_ThreadID=88;_ThreadName=http-thread-pool::jk-connector(2);_TimeMillis=1719511203190;_LevelValue=900;|
Could not find storage driver for: s3|#]
Lastly, I can't seem to wget the reset_settings.sh file. Is there another place to get it?
jamie jamison:ย About the storage driver issue, here is part of the domain.xml code:
<jvm-options>-Ddataverse.files.storage-driver-id=s3</jvm-options> <- is this correct, the default is 'file' but should it be changed to 's3' for the s3 buckets?
<jvm-options>-Xmx3841m</jvm-options>
<jvm-options>-Ddataverse.files.dataverse-test-oregon.type=s3</jvm-options>
<jvm-options>-Ddataverse.files.dataverse-test-oregon.label=dataverse-test-oregon</jvm-options>
<jvm-options>-Ddataverse.files.dataverse-test-oregon.bucket-name=dataverse-test-oregon</jvm-options>
<jvm-options>-Ddataverse.files.dataverse-test-oregon.download-redirect=true</jvm-options>
<jvm-options>-Ddataverse.files.dataverse-test-oregon.upload-redirect=true</jvm-options>
I'm glad fixing the typo is helping!
Sorry, a reset settings script? I'm not familiar with it. You're seeing this in the guides or a release note somewhere?
reset settings script was a suggestion from Don. Not in the guides or release notes.
So far still can't access the s3 buckets and unable to get letsencrypt working.
Oh, I understand now, thanks
@jamie jamison in this case your storage-driver-id for s3 is dataverse-test-oregon
the reset settings script is just a handful of curl commands I threw together in a script to set the Authority of our test server instead of our production, use the throw-away shoulder, and put up a banner warning that the test server is just a test server. it's really a way to automate the stuff I had to manually fix each time I imported a copy of our production database.
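The flavor of such a script, as a sketch - these are stock Dataverse admin API endpoints, but the values here are placeholders and the exact setting names are worth double-checking against the guides:
```
API=http://localhost:8080/api/admin
curl -X PUT -d 10.5072 "$API/settings/:Authority"   # throw-away DOI authority
curl -X PUT -d FK2/ "$API/settings/:Shoulder"       # throw-away shoulder
curl -H "Content-type: application/json" -X POST "$API/bannerMessage" \
  -d '{"messageTexts":[{"lang":"en","message":"This is a TEST server."}]}'
```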
I guess I didn't completely understand the documentation. Right now test is down while we try to get letsencrypt working. But as soon as it's back I'll try that. The clarification is helpful.
This is aside from the installation problem. Since Dataverse 5.14 on RedHat7 is barely working or staying up, I've found I can still publish some of the user datasets with the API - so they can get DOIs for publishing. The API works great.
Is anyone else having issues with 'stuck' payara6? The service can't be stopped, restarted or killed.
sudo systemctl status payara6
× payara6.service - Payara Server
Loaded: loaded (/etc/systemd/system/payara6.service; enabled; preset: disabled)
Active: failed (Result: timeout) since Tue 2024-07-09 17:49:42 UTC; 2 days ago
CPU: 28.414s
Jul 09 17:47:41 ip-172-31-23-165.us-west-2.compute.internal systemd[1]: Starting Payara Server...
Jul 09 17:49:41 ip-172-31-23-165.us-west-2.compute.internal systemd[1]: payara6.service: start operation timed out. Terminating.
Jul 09 17:49:42 ip-172-31-23-165.us-west-2.compute.internal java[728]: Waiting for domain1 to start .............................>
Jul 09 17:49:42 ip-172-31-23-165.us-west-2.compute.internal systemd[1]: payara6.service: Control process exited, code=exited, sta>
Jul 09 17:49:42 ip-172-31-23-165.us-west-2.compute.internal systemd[1]: payara6.service: Failed with result 'timeout'.
Jul 09 17:49:42 ip-172-31-23-165.us-west-2.compute.internal systemd[1]: Failed to start Payara Server.
Jul 09 17:49:42 ip-172-31-23-165.us-west-2.compute.internal systemd[1]: payara6.service: Consumed 28.414s CPU time.
lines 1-12/12 (END)
Don Sizemore said:
jamie jamison in this case your storage-driver-id for s3 is dataverse-test-oregon
I think I need one more clarification. If there are 4 s3 buckets - bucket1, bucket2, bucket3, bucket4 - does each one have to have its own storage-driver-id?
dataverse.files.bucket1-storage-driver-id=bucket1
dataverse.files.bucket2-storage-driver-id=bucket2
dataverse.files.bucket3-storage-driver-id=bucket3
dataverse.files.bucket4-storage-driver-id=bucket4
These options are all mixed up. The <id> part needs to be a unique identifier for your storage; the storage-driver-id option (mind the ".") references the driver to use for storage <id>.
Here's a complete example of an S3 storage from our beta env:
dataverse.files.s3.bucket-name=juelich_data_beta
dataverse.files.s3.custom-endpoint-url=https://s3.fz-juelich.de
dataverse.files.s3.label=Jülich-DATA-Object-Store
dataverse.files.s3.path-style-access=true
dataverse.files.s3.type=s3
dataverse.files.storage-driver-id=s3
So in your case, this needs to be (assuming all of them are S3):
dataverse.files.bucket1.storage-driver-id=s3
dataverse.files.bucket1.s3.xxx
...
dataverse.files.bucket2.storage-driver-id=s3
dataverse.files.bucket2.s3.xxx
...
dataverse.files.bucket3.storage-driver-id=s3
dataverse.files.bucket3.s3.xxx
...
dataverse.files.bucket4.storage-driver-id=s3
dataverse.files.bucket4.s3.xxx
...
The docs say "dataverse.files.storage-driver-id" allows you to "Enable <id> as the default storage driver." -- https://guides.dataverse.org/en/6.3/installation/config.html#list-of-s3-storage-options
So you'll want to pick one of your four buckets as the <id>, like this:
dataverse.files.storage-driver-id=bucket3
Dang you're right! I mixed that up myself! See, it's fricking complicated!
One day I will make the MPCONFIG part for the storage system and take care to rename the default driver option!
So friggin' complicated. Need more docs, I guess, short term.
I'll make an example along with the notes I send you. Last question to clarify: you only need 1 default driver id? Even with 2 or more s3 buckets?
Yes. That's the driver that will be assigned to a new collection if you don't specify a different one.
There's only 1 instance wide default :-)
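Pulling the corrected pattern together, a two-store sketch in the same domain.xml form as the thread's snippets (the ids bucket1/bucket2, labels, and bucket names are placeholders, not a tested config):
```
<jvm-options>-Ddataverse.files.bucket1.type=s3</jvm-options>
<jvm-options>-Ddataverse.files.bucket1.label=bucket1</jvm-options>
<jvm-options>-Ddataverse.files.bucket1.bucket-name=bucket1</jvm-options>
<jvm-options>-Ddataverse.files.bucket2.type=s3</jvm-options>
<jvm-options>-Ddataverse.files.bucket2.label=bucket2</jvm-options>
<jvm-options>-Ddataverse.files.bucket2.bucket-name=bucket2</jvm-options>
<!-- exactly one instance-wide default store: -->
<jvm-options>-Ddataverse.files.storage-driver-id=bucket1</jvm-options>
```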
Thank you for clarifying. I'll send over my example as soon as I get finished setting up the new server.
Here is some code following the example:
<jvm-options>-Ddataverse.files.datavers-test-oregon.storage-driver-id=s3</jvm-options>
<jvm-options>-Ddataverse.files.dataverese-test-oregon.type=s3</jvm-options>
<jvm-options>-Ddataverse.files.dataverse-test-oregon=dataverse-test-oregon</jvm-options>
<jvm-options>-Ddataverse.files.dataverse-test-oregon.download-redirect=true</jvm-options>
<jvm-options>-Ddataverse.files.dataverse-test-oregon.upload-redirect=true</jvm-options>
<jvm-options>-Ddataverse.files.ssda-files.storage-driver-id=s3</jvm-options>
<jvm-options>-Ddataverse.files.ssda-files.type=s3</jvm-options>
<jvm-options>-Ddataverse.files.ssda-files=ssda-files</jvm-options>
<jvm-options>-Ddataverse.files.ssda-files.download-redirect=true</jvm-options>
<jvm-options>-Ddataverse.files.ssda-files.upload-redirect=true</jvm-options>
and I still get:
[#|2024-07-12T20:14:08.003+0000|WARNING|Payara 6.2023.8|edu.harvard.iq.dataverse.dataaccess.DataAccess|_ThreadID=88;_ThreadName=http-thread-pool::jk-connector(3);_TimeMillis=1720815248003;_LevelValue=900;|
Could not find storage driver for: s3|#]
[#|2024-07-12T20:14:08.003+0000|WARNING|Payara 6.2023.8|edu.harvard.iq.dataverse.ThumbnailServiceWrapper|_ThreadID=88;_ThreadName=http-thread-pool::jk-connector(3);_TimeMillis=1720815248003;_LevelValue=900;|
getDatasetCardImageAsUrl(): Failed to initialize dataset StorageIO for s3://10.5072/FK2/MN0MUZ (getDataAccessObject: Unsupported storage method.)|#]
[#|2024-07-12T20:14:08.008+0000|WARNING|Payara 6.2023.8|edu.harvard.iq.dataverse.dataaccess.DataAccess|_ThreadID=88;_ThreadName=http-thread-pool::jk-connector(3);_TimeMillis=1720815248008;_LevelValue=900;|
Could not find storage driver for: s3|#]
[#|2024-07-12T20:14:08.009+0000|WARNING|Payara 6.2023.8|edu.harvard.iq.dataverse.ThumbnailServiceWrapper|_ThreadID=88;_ThreadName=http-thread-pool::jk-connector(3);_TimeMillis=1720815248009;_LevelValue=900;|
getDatasetCardImageAsUrl(): Failed to initialize dataset StorageIO for s3://10.5072/FK2/KHOWY6 (getDataAccessObject: Unsupported storage method.)|#]
So maybe the problem is something else? I did check the aws configuration and that looks correct.
dataverse.files.datavers-test-oregon.storage-driver-id=s3 has a typo in it. An "e" is missing from the end of "dataverse".
well that's embarrassing for an old secretary... Will go retry now
Personally, I would use an <id> that doesn't have hyphens in it. But maybe they do work. Dunno.
I fixed the typo. Still get the error. The reason I formatted it the way that I did was an example from Oliver Bertuch (above):
dataverse.files.bucket4.storage-driver-id=s3
and this is what's in documentation:
dataverse.files.storage-driver-id <id> Enable <id> as the default storage driver. file
I feel like there's something amiss with your config but it's late here. Maybe I can look at it with fresh eyes next week.
There's also the missing .label thing
Educated guess on the error messages: the storage driver ID is IIRC built into the location URL. So there is no storage driver called "s3" around as per the config you posted
Yupp. These storage ids need to be migrated... (DataAccess.getStorageIdFromLocation)
Or you configure a storage driver that has an id "s3" :wink:
Could you post a sample of what that looks like? I've been using this example:
dataverse.files.bucket2.storage-driver-id=s3
Related: https://github.com/IQSS/dataverse/issues/10684
@jamie jamison this one doesn't look right, for example:
<jvm-options>-Ddataverse.files.ssda-files=ssda-files</jvm-options>
ssda-files is your <id> and there should be something after it like .type or .label or whatever.
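For reference, Oliver's alternative - a store whose <id> is literally "s3", matching the id already baked into the stored s3:// locations - might look like this sketch (bucket name reused from this thread):
```
<jvm-options>-Ddataverse.files.s3.type=s3</jvm-options>
<jvm-options>-Ddataverse.files.s3.label=s3</jvm-options>
<jvm-options>-Ddataverse.files.s3.bucket-name=dataverse-test-oregon</jvm-options>
<jvm-options>-Ddataverse.files.storage-driver-id=s3</jvm-options>
```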
Another non-s3 question. We're still having trouble with the install. I was asked what branch we should be using if we want to install dataverse 6.2 on rocky 9. Ongoing issues with letsencrypt.
Should we use 243_rocky_9 or 6.2?
Thanks @jamie, if we would like to use the community ansible scripts for dvn 6.2 on rocky 9, what is the suggested branch to use?
https://github.com/gdcc/dataverse-ansible/tree/v6.2 or
https://github.com/gdcc/dataverse-ansible/tree/243_rocky_9
@Kristian Allen - UCLA is this a new installation? Typically that's what the Ansible scripts are used for.
Yes, we are doing a fresh install because we were unable to update RHEL7 in-place. So we are doing a fresh install of 6.2 on Rocky 9
Right, right. I don't want to tell you the wrong thing. Better wait for @Don Sizemore
yes @jamie jamison is right -- clean install. We've been using the v6.2 branch but have been having a few issues; we'll submit some PRs once things settle down and we get things working.
The current blocker is multiple processes grabbing the same port:
Following the readme, we updated the main.yml
apache:
enabled: true
public_fqdn:
ssl:
enabled: false
remote_cert: false
port: 443
cert:
interm:
key:
pem:
cert:
key:
interm:
port: 80
But is 443 or 80 redefined anywhere, like in dataverse-apache.yml?
- name: certbot bonks on listen 443
lineinfile:
path: '{{ apache_virtualhost_dir }}/http.proxy.conf'
regexp: '^Listen 443 https'
state: absent
when: letsencrypt.enabled == true
@Kristian Allen - UCLA the Apache proxy bits of Dataverse-Ansible are very, very untested. pull requests to fix problems or specific issue descriptions are most welcome.
Well, we're definitely testing them at ucla
@Don Sizemore We're wondering if we are using the correct branch - 243_rocky_9 or 6.2. So I don't make a pull request for the wrong branch.
@jamie jamison I could attempt to merge the 6.2 branch with Rocky9, if you like? I'm happy to schedule a meeting. The httpd portion of Dataverse-Ansible is entirely untested. LetsEncrypt refuses to issue certs for addresses ending in amazon.com, so that code is theoretical.
I'm going to wait for Tim and Kristian but wondering if I should just do the fresh install with 6.0 or 6.1 and then do conventional updates?
How are most people doing a fresh install?
@Don Sizemore do you think it might help @jamie jamison and @Kristian Allen - UCLA to use the way I'm upgrading, with migrations and all?
Since the ansible script for 6.2 doesn't seem to be working I'm happy to hear suggestions
I can't upgrade from 5.14 because RHEL7 hit end-of-life. Have to do a fresh install (on rocky9)
It's the letsencrypt part that fails. We're going to try yet again and install letsencrypt after the fact. But I'd still like to know how other people are doing fresh installs.
Successfully, that is.
Have you tried disabling the Let's Encrypt part from the DV Ansible? And then using another role that has been tested working on RHEL 9 or doing it manually?
Just to get stuff going...
In the 6.2 ansible script, letsencrypt out-of-the-box is disabled. We're going to test again on a fresh Rocky 9 and see how that goes.
@Don Sizemore Would you have any time to meet on a friday?
@Oliver Bertuch Dataverse-Ansible works with RHEL9, but IIRC that (finally) got merged after the 6.2 branch
@jamie jamison I think I should be able to meet late Friday morning or in the afternoon? You'll appreciate that a test server here started inexplicably receiving 403 Forbiddens from AWS S3. No config changes.
@jamie jamison on your 403 Forbiddens: did you update your CORS rules per the CORS section of https://github.com/IQSS/dataverse/releases/v5.12 ?
We're still back at getting letsencrypt working but I believe I did since we were at 5.14
We're in a different time zone but morning or afternoon is fine. Are you on EST?
12:30 or 1pm our time? I'll figure out pst to your time
we're on EDT but I'm getting 403 Forbiddens on a test server, so I'm interested in comparing notes there.
What's a good time for you?
@jamie jamison I have a 9:45 EDT with Sonia, but other than that I'm free. On the 403 forbiddens: how are you defining your S3 credentials?
Dealing with the letsencrypt and certbot somewhat distracted me from the s3 buckets. I'm getting back to that today
I should add that I need a working system to deal with the s3 buckets and at this time I don't have that. I'm wondering how other people do their installations or migration installations: the ansible script or the manual install?
I have a working test dataverse and some working s3 buckets. I'm using @Oliver Bertuch's example:
dataverse.files.bucket1.storage-driver-id=s3
dataverse.files.bucket1.s3.xxx
...
dataverse.files.bucket2.storage-driver-id=s3
dataverse.files.bucket2.s3.xxx
...
dataverse.files.bucket3.storage-driver-id=s3
dataverse.files.bucket3.s3.xxx
...
dataverse.files.bucket4.storage-driver-id=s3
dataverse.files.bucket4.s3.xxx
So each bucket needs to have its own storage id.
It still looks a little off to me but if it's working, great!
The example above is from @Oliver Bertuch. I guess what I had been trying to verify is whether every bucket needs its own storage driver if they are all the same storage type.
Our current theory is to do a fresh install of 6.3 via Ansible (this is working), then migrate the database from 5.14 to 6.3 via Flyway. Does this make sense or does anyone see any areas to be aware of?
The thinking was that https://github.com/IQSS/dataverse/blob/0d279573bb8d7c96d7a4a1dc4b66b2258059dfba/src/main/java/edu/harvard/iq/dataverse/flyway/StartupFlywayMigrator.java#L16 can be run on its own?
Or is there a cleaner way to do this via pulling objects out and putting back in via DVN API?
I mean, people have pulled off similar feats but officially we ask that you step through each upgrade. (We know this is painful.)
I don't think the step-through upgrade path would work in our case, as our 5.14 instance is on an older version of RHEL and we have to move to a new OS (Rocky), so we were trying to determine a path forward
I think we'll power forward with the manual migration but keep meticulous notes and then we'll make sure to write up for the community
@Kristian Allen - UCLA I personally think that your new server is a great spot to migrate/upgrade from 5.14 to 6.0, then carry on to 6.3.
@Kristian Allen - UCLA you might be interested in this "Our experience on upgrading dataverse from 4.2.2 to 6.1 in a non-standard way" thread: https://groups.google.com/g/dataverse-community/c/BUO37-reWIs/m/18SK0DZlAQAJ
I have been doing experiments on a similar migration from 4.20 to 6.3 using Flyway only as well.
You can find the necessary extra migrations you will need when using Flyway only here: https://jugit.fz-juelich.de/fdm/dev/dataverse/-/tree/juelich-data-upgrade/src/main/resources/db/extra?ref_type=heads
Obviously, you'll only need the 5.14+ migrations, not the others.
I've also added the Flyway Maven Plugin in this branch to the DV app POM. https://jugit.fz-juelich.de/fdm/dev/dataverse/-/blob/juelich-data-upgrade/pom.xml?ref_type=heads#L755
I can share more details about how I used a few local containers to get the job done if that helps.
Don Sizemore said:
Kristian Allen - UCLA I personally think that your new server is a great spot to migrate/upgrade from 5.14 to 6.0, then carry on to 6.3.
@Don Sizemore I'm trying to install a fresh test 5.14. The directions say it "may be installed using branches tagged with that version". What is the correct syntax to tag the version?
Would this look like:
```
git clone https://github.com/GlobalDataverseCommunityConsortium/dataverse-ansible.git dataverse-5.14
```
I might get this wrong, but I believe it refers to the tagging on the Dataverse side.
I think you'd change this from "version:6.3" to whatever version you want but I'm not sure if it "just works": https://github.com/gdcc/dataverse-ansible/blob/c1e21f255dd00bf5d95c2a6a807ae7579974361b/defaults/main.yml#L264
Oliver Bertuch said:
I can share more details about how I used a few local containers to get the job done if that helps.
@Oliver Bertuch Yes, some details on how this is used would help.
Ok so here's a quick list of tasks:
1. pg_dump your current database into a dump.sql file.
2. docker compose up to start a PG container exposed on port 54321, and restore the dump.sql file into it.
3. Put the extra migrations in src/main/resources/db/extra.
4. mvn flyway:info. You should see that some migrations are missing.
5. mvn flyway:migrate -Dflyway.outOfOrder
6. mvn flyway:info
7. Remove the extra folder or comment out that location in the POM - the idea is to make them unavailable again, as we don't want them around for all eternity or during the next upgrade.
8. mvn flyway:repair
9. mvn flyway:info again should no longer list our two extra migrations!
10. docker exec -it data-next-postgres pg_dump -U dataverse > dump-v6.3.sql
11. docker compose down

@jamie jamison you want to clone the 5.14 branch:
$ git clone -b 5.14 https://github.com/gdcc/dataverse-ansible.git 5.14
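A side note on step 2 of Oliver's list: a minimal compose file matching the container name and port there might look like the sketch below (the image tag and credentials are assumptions, not his actual file):
```
services:
  postgres:
    image: postgres:16
    container_name: data-next-postgres
    environment:
      POSTGRES_USER: dataverse     # matches the -U dataverse in step 10
      POSTGRES_PASSWORD: secret    # placeholder
      POSTGRES_DB: dataverse
    ports:
      - "54321:5432"               # exposes PG on host port 54321
```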
Don Sizemore said:
jamie jamison you want to clone the 5.14 branch:
$ git clone -b 5.14 https://github.com/gdcc/dataverse-ansible.git 5.14
Tried that and ran ansible:
```
ansible-playbook --connection=local -vvv -i 5.14/inventory 5.14/dataverse.pb -e "@5.14/defaults/main.yml"
```
Unfortunately it failed at the step [dataverse : install java-nnn-openjdk and other packages for RedHat/Rocky], task path: /home/rocky/dataverse/tasks/dataverse-prereqs.yml:63
Huge, long error message. Don't know if you want to see that.
Possible java version error:
```
fatal: [localhost]: FAILED! => {
"msg": "An unhandled exception occurred while templating '{'version': 11, 'home': '/usr/lib/jvm/java-{{ java.version}}'}'. Error was a <class 'ansible.errors.AnsibleError'>, original
```
@jamie jamison sigh, that again. someone else introduced a reflexive group_var without testing the role. two quick fixes: either edit tasks/dataverse-prereqs.yml and hard-code 11, or better yet set the java.home variable in group_vars - either should get you going. Solr mirrors tend to remove old versions, so if it dies downloading Solr, you can give it a custom Solr download url in group_vars pointing to archive.apache.org.
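A guess at the group_vars override, with names inferred from the error above (untested; hard-coding home stops the template from referencing java.version recursively):
```
java:
  version: 11
  home: /usr/lib/jvm/java-11
```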
I finally was able to restore the test dataverse 5.14 from an aws snapshot. I'm going to keep the notes on restoring earlier versions. Might be helpful for other projects.