Stream: troubleshooting

Topic: Upgrading 5.14 to 6.2


view this post on Zulip Santiago Florez (Jun 05 2024 at 16:08):

Hello everyone, how are you?

I have been trying to upgrade my Dataverse installation, from 5.14 to 6.2, but I have had a problem deploying the dataverse war file. The problem is with postgres.

I have created a new database, but the problem continues. There is an image showing the problem.

image.png

Does anyone know what I should do?

view this post on Zulip Santiago Florez (Jun 05 2024 at 16:16):

I downloaded the war file from https://github.com/IQSS/dataverse/releases/tag/v6.2

image.png

view this post on Zulip Philip Durbin 🚀 (Jun 05 2024 at 18:07):

Hi! First, you have to upgrade to 6.0, then 6.1, then 6.2.

Second, can you please show us server.log?

view this post on Zulip Santiago Florez (Jun 05 2024 at 18:17):

@Philip Durbin thanks for the answer. So, I'm going to upgrade version by version.

This is the log file.
Error_Dataverse6.2.txt

view this post on Zulip Santiago Florez (Jun 05 2024 at 18:22):

I don't understand why I have this postgres error

view this post on Zulip Philip Durbin 🚀 (Jun 05 2024 at 18:29):

Did you do this step from https://github.com/IQSS/dataverse/releases/tag/v6.0 ?

sudo -u dataverse /usr/local/payara6/bin/asadmin create-jvm-options --add-opens=java.base/java.io=ALL-UNNAMED

I'm asking because in your log I see module java.base does not "opens java.io" to unnamed module and I'm reminded of #10068.

view this post on Zulip Santiago Florez (Jun 06 2024 at 15:59):

Hi @Philip Durbin I have decided to install the new dataverse version from scratch. Thank goodness I didn't have any information loaded yet.

view this post on Zulip Philip Durbin 🚀 (Jun 06 2024 at 18:00):

Sounds easier!

view this post on Zulip Sherry Lake (Jun 07 2024 at 12:09):

I noticed Harvard is at 6.2, but this version: v. 6.2 build "v6.2+10451+10463+10383-iqss"

Any special reason why this is not "the" 6.2 release?

UVa is upgrading from 5.14 to 6.2 next week and I am wondering about running into problems. There seem to be a few that folks have encountered (from the Dataverse Google group and here on Zulip).

view this post on Zulip Philip Durbin 🚀 (Jun 07 2024 at 13:25):

As you might guess, #10451 #10463 and #10383 are all pull requests:

I guess these are all fixes we needed for Harvard Dataverse. They will all be included in 6.3.

view this post on Zulip jamie jamison (Jun 13 2024 at 00:44):

I'm finally able to list buckets via aws cli. The production bucket policy doesn't have an explicit line for listing the bucket and doesn't seem to need it. But I added the explicit list ("Action": "s3:ListBucket",) to dataverse-test-oregon and suddenly I could list it with aws cli.
I'm not sure what the difference is but will do this for the new production system.
Back to getting shibboleth and doi set up correctly in domain.xml.

view this post on Zulip jamie jamison (Jun 13 2024 at 17:24):

s3 bucket issues. The production bucket policy (RHEL 7) doesn't have an explicit line for listing the bucket and doesn't seem to need it. But I added the explicit list ("Action": "s3:ListBucket",) to dataverse-test-oregon and suddenly I could list it with aws cli. I don't know if this is a difference between 5.14 and 6.2 but I'm going to follow this example for the production 6.2.

view this post on Zulip Philip Durbin 🚀 (Jun 13 2024 at 17:28):

Hmm, is this something we should fix in our guides?

view this post on Zulip jamie jamison (Jun 13 2024 at 17:33):

I haven't installed a clean system since 4.11 or thereabouts. It seems that what worked up to 5.14 is different in 6.2. Bit of a jump between versions so I may have missed something. Examples for the pid/doi settings might be helpful. I'll post mine when I get through this install.

view this post on Zulip jamie jamison (Jun 13 2024 at 17:35):

Sorry, I just realized I posted in the wrong place. I'm limping through 5.14 to 6.2

view this post on Zulip jamie jamison (Jun 13 2024 at 17:36):

My new errors are actually helpful. Things I need to correct in domain.xml.
[#|2024-06-11T02:46:25.683+0000|WARNING|Payara 6.2023.8|edu.harvard.iq.dataverse.ThumbnailServiceWrapper|_ThreadID=90;_ThreadName=http-thread-pool::jk-connector(4);_TimeMillis=1718073985683;_LevelValue=900;|
getDatasetCardImageAsUrl(): Failed to initialize dataset StorageIO for s3://10.5072/FK2/QF48PD (getDataAccessObject: Unsupported storage method.)|#]

[#|2024-06-11T02:46:25.688+0000|WARNING|Payara 6.2023.8|edu.harvard.iq.dataverse.dataaccess.DataAccess|_ThreadID=90;_ThreadName=http-thread-pool::jk-connector(4);_TimeMillis=1718073985688;_LevelValue=900;|
Could not find storage driver for: s3|#]

[#|2024-06-11T02:46:25.688+0000|WARNING|Payara 6.2023.8|edu.harvard.iq.dataverse.ThumbnailServiceWrapper|_ThreadID=90;_ThreadName=http-thread-pool::jk-connector(4);_TimeMillis=1718073985688;_LevelValue=900;|
getDatasetCardImageAsUrl(): Failed to initialize dataset StorageIO for s3://10.5072/FK2/MN0MUZ (getDataAccessObject: Unsupported storage method.)|#]

[#|2024-06-11T02:46:25.693+0000|WARNING|Payara 6.2023.8|edu.harvard.iq.dataverse.dataaccess.DataAccess|_ThreadID=90;_ThreadName=http-thread-pool::jk-connector(4);_TimeMillis=1718073985693;_LevelValue=900;|
Could not find storage driver for: s3|#]

[#|2024-06-11T02:46:25.693+0000|WARNING|Payara 6.2023.8|edu.harvard.iq.dataverse.ThumbnailServiceWrapper|_ThreadID=90;_ThreadName=http-thread-pool::jk-connector(4);_TimeMillis=1718073985693;_LevelValue=900;|
getDatasetCardImageAsUrl(): Failed to initialize dataset StorageIO for s3://10.5072/FK2/KHOWY6 (getDataAccessObject: Ulogs/server.log

view this post on Zulip Philip Durbin 🚀 (Jun 13 2024 at 17:41):

jamie jamison said:

Sorry, I just realized I posted in the wrong place. I'm limping through 5.14 to 6.2

No worries, I moved the messages.

view this post on Zulip jamie jamison (Jun 13 2024 at 17:48):

I'm not sure what some of the domain.xml settings should be so I'm working it out via trial and error.

view this post on Zulip jamie jamison (Jun 13 2024 at 18:03):

documentation question (https://guides.dataverse.org/en/latest/installation/config.html#second-configure-your-dataverse-installation-to-use-s3-storage):
example: dataverse.files.storage-driver-id <id> Enable <id> as the default storage driver. file (<- default)
Since all my storage is in s3 buckets, should this be: -Ddataverse.files.storage-driver-id=s3

view this post on Zulip Philip Durbin 🚀 (Jun 13 2024 at 18:04):

As long as your id is "s3", yes. If your id is "foobar" it should be "foobar".

view this post on Zulip jamie jamison (Jun 13 2024 at 18:11):

I'm trying to track down storage errors:

[#|2024-06-11T02:46:25.683+0000|WARNING|Payara 6.2023.8|edu.harvard.iq.dataverse.ThumbnailServiceWrapper|_ThreadID=90;_ThreadName=http-thread-pool::jk-connector(4);_TimeMillis=1718073985683;_LevelValue=900;|
getDatasetCardImageAsUrl(): Failed to initialize dataset StorageIO for s3://10.5072/FK2/QF48PD (getDataAccessObject: Unsupported storage method.)|#]

[#|2024-06-11T02:46:25.688+0000|WARNING|Payara 6.2023.8|edu.harvard.iq.dataverse.dataaccess.DataAccess|_ThreadID=90;_ThreadName=http-thread-pool::jk-connector(4);_TimeMillis=1718073985688;_LevelValue=900;|
Could not find storage driver for: s3|#]

view this post on Zulip Don Sizemore (Jun 13 2024 at 18:20):

@Philip Durbin @jamie jamison Dataverse doesn't need listBucket but the CLI does, if you want to pull a listing.

view this post on Zulip jamie jamison (Jun 13 2024 at 18:23):

Perhaps inaccurately, I was using aws cli to check if the buckets were accessible.
Right now I'm stuck at the point where payara6 is running but dataverse pages aren't loading.

view this post on Zulip jamie jamison (Jun 13 2024 at 22:35):

I managed to break the test system to the point it no longer loads. Since I thought it was idempotent I tried running the ansible script again. I stopped and disabled payara6, ran the script and got this message:
TASK [dataverse : fire off installer] **********************
fatal: [localhost]: FAILED! => {"changed": true, "cmd": "/usr/bin/python3 /tmp/dvinstall/install.py -f --config_file=default.config --noninteractive > /tmp/dvinstall/install.out 2>&1", "delta": "0:01:26.506903", "end": "2024-06-13 22:32:09.208629", "msg": "non-zero return code", "rc": 1, "start": "2024-06-13 22:30:42.701726", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}

So I realize now it's not idempotent. I can rebuild from scratch if necessary, had some practice. But I'm wondering if dropping the dataverse database table would help?

view this post on Zulip jamie jamison (Jun 14 2024 at 00:13):

To rerun the ansible script: 1) stop and disable payara6, 2) drop (and probably dump) the dvndb, 3) probably doesn't hurt to delete the domain.xml file and then rerun the ansible script.

view this post on Zulip Philip Durbin 🚀 (Jun 14 2024 at 01:42):

Go go go!

view this post on Zulip jamie jamison (Jun 14 2024 at 04:28):

Well that was fun while it lasted. Now can't start or restart payara6

view this post on Zulip jamie jamison (Jun 14 2024 at 18:27):

Anyone out there with postgres experience?
I was able to reload the dvndb database but I have to set (or reset) the privileges. I'm trying to figure out what '=Tc/postgres postgres=CTc/postgres dvnuser=CTc/postgres' means so I can add that to dvndb.

view this post on Zulip Philip Durbin 🚀 (Jun 14 2024 at 18:30):

=Tc/postgres? Where are you seeing that?

view this post on Zulip jamie jamison (Jun 14 2024 at 18:30):

I log into postgres and display the tables
dvndb | postgres | UTF8 | C.UTF-8 | C.UTF-8 |
dvndb_bkup | postgres | UTF8 | C.UTF-8 | C.UTF-8 | =Tc/postgres +
| | | | | postgres=CTc/postgres+
| | | | | dvnuser=CTc/postgres
dvndp | postgres | UTF8 | C.UTF-8 | C.UTF-8 |
postgres | postgres | UTF8 | C.UTF-8 | C.UTF-8 |
template0 | postgres | UTF8 | C.UTF-8 | C.UTF-8 | =c/postgres +
| | | | | postgres=CTc/postgres
template1 | postgres | UTF8 | C.UTF-8 | C.UTF-8 | =c/postgres +
| | | | | postgres=CTc/postgres
(6 rows)

And yes, dvndp is a misspelling

view this post on Zulip jamie jamison (Jun 14 2024 at 18:31):

image.png

view this post on Zulip jamie jamison (Jun 14 2024 at 18:32):

Just found Create, connect and temporary (stackexchange)

view this post on Zulip jamie jamison (Jun 14 2024 at 18:35):

After running ansible it seems you have to drop or rename that initial dvndb, make a new dvndb and then reload, and then add privileges. (By the way did I mention that I lost my sys admin support a week before RHEL7 end-of-life, hence the frantic googling to rebuild on a new system.)

view this post on Zulip Philip Durbin 🚀 (Jun 14 2024 at 18:37):

I see, like this:

 dvndb      | postgres | UTF8     | C.UTF-8 | C.UTF-8 |
 dvndb_bkup | postgres | UTF8     | C.UTF-8 | C.UTF-8 | =Tc/postgres         +
            |          |          |         |         | postgres=CTc/postgres+
            |          |          |         |         | dvnuser=CTc/postgres
 dvndp      | postgres | UTF8     | C.UTF-8 | C.UTF-8 |
 postgres   | postgres | UTF8     | C.UTF-8 | C.UTF-8 |
 template0  | postgres | UTF8     | C.UTF-8 | C.UTF-8 | =c/postgres          +
            |          |          |         |         | postgres=CTc/postgres
 template1  | postgres | UTF8     | C.UTF-8 | C.UTF-8 | =c/postgres          +
            |          |          |         |         | postgres=CTc/postgres
(6 rows)

I'm using fenced code blocks above. Very useful here in Zulip and in GitHub. You could try this if you want:

```
test
```

view this post on Zulip Philip Durbin 🚀 (Jun 14 2024 at 18:39):

But yeah, I'm not familiar with =Tc/postgres.

view this post on Zulip Philip Durbin 🚀 (Jun 14 2024 at 18:39):

Check this out: https://www.postgresql.org/docs/current/ddl-priv.html#PRIVILEGE-ABBREVS-TABLE

view this post on Zulip Philip Durbin 🚀 (Jun 14 2024 at 18:41):

and https://dba.stackexchange.com/questions/264441/is-my-database-secure-what-does-tc-postgres-allow/264445#264445 seem to say the same thing, that "T" is for TEMPORARY and "c" is for CONNECT.
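For reference, a minimal sketch of how those abbreviations read. It decodes one entry of the access-privileges column as printed in the listing above, and only knows the database-level letters C (CREATE), T (TEMPORARY) and c (CONNECT) from the PostgreSQL abbreviations table. The function names are hypothetical, and the GRANT text it builds does no identifier quoting, so treat it as an illustration rather than a migration tool.

```python
# Database-level privilege letters from the PostgreSQL abbreviations table.
DB_PRIVILEGES = {"C": "CREATE", "T": "TEMPORARY", "c": "CONNECT"}


def decode_acl(item):
    """Decode one 'grantee=privs/grantor' entry from a psql database listing."""
    # psql marks wrapped cells with a trailing "+", so strip that first.
    text = item.strip().rstrip("+").strip()
    grantee_and_privs, grantor = text.split("/", 1)
    grantee, privs = grantee_and_privs.split("=", 1)
    return {
        "grantee": grantee or "PUBLIC",  # an empty grantee means PUBLIC
        "privileges": [DB_PRIVILEGES.get(ch, ch) for ch in privs],
        "grantor": grantor,
    }


def grant_statement(database, item):
    """Build the GRANT statement that would recreate one ACL entry."""
    acl = decode_acl(item)
    privs = ", ".join(acl["privileges"])
    return f"GRANT {privs} ON DATABASE {database} TO {acl['grantee']};"


# The three entries shown for dvndb_bkup, applied here to dvndb.
for entry in ["=Tc/postgres +", "postgres=CTc/postgres+", "dvnuser=CTc/postgres"]:
    print(grant_statement("dvndb", entry))
```

Read this way, `=Tc/postgres` is a grant of TEMPORARY and CONNECT to PUBLIC, made by the `postgres` role.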

view this post on Zulip Don Sizemore (Jun 14 2024 at 18:50):

@jamie jamison IIRC postgres ownership and privileges may need to be modified table-by-table

view this post on Zulip Don Sizemore (Jun 14 2024 at 18:50):

@jamie jamison if you dump and import as Phil suggests you can use the -O (no "owner") flag which may greatly simplify the process?

view this post on Zulip jamie jamison (Jun 14 2024 at 18:51):

That makes rebuilding the database a bit more daunting. I'll go back and try that - dump from the previous test system and bring over to new.

view this post on Zulip jamie jamison (Jun 14 2024 at 19:58):

Dropped the database and reloaded from a dump. It now matches up to what the old server postgres looked like but still not getting the dataverse front page. May take waiting longer

view this post on Zulip Philip Durbin 🚀 (Jun 14 2024 at 20:00):

Anything at the equivalent of https://demo.dataverse.org/api/info/version ?
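A small sketch of that check, for anyone who wants to script it. It assumes the endpoint answers with JSON of the form `{"status": "OK", "data": {"version": ..., "build": ...}}`. The function names and the `http://localhost:8080` base URL in the usage comment are assumptions for illustration.

```python
import json
import urllib.request


def parse_version(body):
    """Pull (status, version) out of an /api/info/version response body."""
    payload = json.loads(body)
    return payload.get("status"), payload.get("data", {}).get("version")


def fetch_version(base_url, timeout=10):
    """Ask a Dataverse installation which version it is running.

    urlopen raises urllib.error.HTTPError on a 404, which is itself a hint
    that the application is not deployed even though the server is up.
    """
    url = f"{base_url}/api/info/version"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_version(resp.read().decode("utf-8"))


# Usage (base URL is an assumption, adjust to your installation):
#   status, version = fetch_version("http://localhost:8080")
```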

view this post on Zulip jamie jamison (Jun 14 2024 at 20:02):

I'll try that as soon as I can get payara6 to start again

view this post on Zulip jamie jamison (Jun 14 2024 at 20:04):

404 page not found, requested service not available

view this post on Zulip Philip Durbin 🚀 (Jun 14 2024 at 20:06):

any errors in server.log?

view this post on Zulip jamie jamison (Jun 14 2024 at 20:22):

But I do get the payara page

view this post on Zulip jamie jamison (Jun 14 2024 at 20:22):

I'm digging in the server log now

view this post on Zulip jamie jamison (Jun 14 2024 at 20:24):

[#|2024-06-14T00:19:37.128+0000|SEVERE|Payara 6.2023.8|edu.harvard.iq.dataverse.mydata.DataRetrieverAPI|_ThreadID=90;_ThreadName=http-thread-pool::jk-connector(5);_TimeMillis=1718324377128;_LevelValue=1000;|
Sorry, nothing was found for these roles: Admin, File Downloader, Dataverse + Dataset Creator, Dataverse Creator, Dataset Creator, Contributor, Curator, Member|#]

view this post on Zulip jamie jamison (Jun 14 2024 at 20:24):

Looks like the database is messed up somehow

view this post on Zulip Philip Durbin 🚀 (Jun 14 2024 at 20:35):

Huh. Yeah, sounds like it. :disappointed:

view this post on Zulip jamie jamison (Jun 14 2024 at 20:35):

There has to be a way to reload or restore a database

view this post on Zulip Philip Durbin 🚀 (Jun 14 2024 at 20:41):

I mean, a long time ago I used scripts like this to dump and restore:

10 years ago according to git. But something like that should work, I would think.

view this post on Zulip jamie jamison (Jun 14 2024 at 20:42):

that looks like what I did (will paste the commands in)

view this post on Zulip Philip Durbin 🚀 (Jun 14 2024 at 20:44):

It looks like @Deirdre Kirmis has experience with database backup and recovery: https://groups.google.com/g/dataverse-community/c/kTEUkxHB_ZM/m/T6aRZ9SpCwAJ

view this post on Zulip jamie jamison (Jun 14 2024 at 20:49):

Ok, this is probably not an official way to fix things but after reloading the database I reran ansible and now the test looks ok.
At that point I did as @Philip Durbin suggested and ran with "/api/info/version" and got status 'ok' and version '6.2'

view this post on Zulip jamie jamison (Jun 14 2024 at 20:49):

Is it possible that rerunning the ansible script fixed the missing roles?

view this post on Zulip jamie jamison (Jun 17 2024 at 17:18):

I think that I'm following the directions exactly from https://guides.dataverse.org/en/latest/installation/config.html#second-configure-your-dataverse-installation-to-use-s3-storage

Example :
<jvm-options>-Ddataverse.files.storage-driver-id=file</jvm-options>
<jvm-options>-Xmx3841m</jvm-options>
<jvm-options>-Ddataverse.files.dataverse-test-oregon.type=s3</jvm-options>
<jvm-options>-Ddataverse.files.dataverse-test-oregon.label=dataverse-test-oregon</jvm-options>
<jvm-options>-Ddataverse.files.dataverse-test-oregon.type=s3</jvm-options>
<jvm-options>-Ddataverse.files.dataverse-test-oregon=dataverse-test-oregon</jvm-options>
<jvm-options>-Ddataverse.files.dataverse-test-oregon.download-redirect=true</jvm-options>
<jvm-options>-Ddataverse.files.dataverse-test-oregon.upload-redirect=true</jvm-options>

If I restart and check status of payara6 I get:
sudo systemctl status payara6
× payara6.service - Payara Server
Loaded: loaded (/etc/systemd/system/payara6.service; enabled; preset: disabled)
Active: failed (Result: exit-code) since Mon 2024-06-17 17:07:10 UTC; 8min ago
Process: 43806 ExecStart=/usr/bin/java -jar /usr/local/payara6/glassfish/lib/client/appserver-cli.jar start-domain (code=exited, status=1/FAILURE)
CPU: 2.396s

Jun 17 17:07:08 ip-172-31-23-165.us-west-2.compute.internal systemd[1]: Starting Payara Server...
Jun 17 17:07:10 ip-172-31-23-165.us-west-2.compute.internal java[43806]: Port 7676 is in use
Jun 17 17:07:10 ip-172-31-23-165.us-west-2.compute.internal java[43806]: Command start-domain failed.
Jun 17 17:07:10 ip-172-31-23-165.us-west-2.compute.internal systemd[1]: payara6.service: Control process exited, code=exited, status=1/FAILURE
Jun 17 17:07:10 ip-172-31-23-165.us-west-2.compute.internal systemd[1]: payara6.service: Failed with result 'exit-code'.
Jun 17 17:07:10 ip-172-31-23-165.us-west-2.compute.internal systemd[1]: Failed to start Payara Server.
Jun 17 17:07:10 ip-172-31-23-165.us-west-2.compute.internal systemd[1]: payara6.service: Consumed 2.396s CPU time.
[rocky@ip-172-31-23-165 ~]$

And, oddly the test system looks like it's up, though with no buckets until I reboot the system then I get a 500
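The "Port 7676 is in use" line suggests something was already listening on that port when start-domain ran. One possible explanation (an assumption, not confirmed in this thread) is that an earlier Payara process was still running, which would also fit the site appearing to be up. A minimal sketch for checking which local ports already have a listener: port 7676 comes from the log above, while 8080 and 4848 are the usual Payara HTTP and admin defaults and are assumptions here.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    for port in (7676, 8080, 4848):
        state = "in use" if port_in_use(port) else "free"
        print(f"port {port}: {state}")
```

If a port shows as in use while the service is reported as failed, the process holding it is worth finding before trying another restart.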

view this post on Zulip jamie jamison (Jun 17 2024 at 19:41):

And another question. When it's up I'm not able to log into the test dataverse with the default admin account. In the ansible script it looks like the admin account is 'dataverseAdmin' and the password is "admin1" which gets the error "username, email address, or password you entered is invalid."

view this post on Zulip jamie jamison (Jun 17 2024 at 19:57):

And for shibboleth, following https://guides.dataverse.org/en/latest/installation/shibboleth.html#install-shibboleth-via-yum
I'm stuck at the error: Error: GPG check FAILED

view this post on Zulip Don Sizemore (Jun 18 2024 at 14:40):

is there anything more descriptive in the Payara server.log?
on Shibboleth: worst case you can set gpgcheck=0 in the repo config, but better to import the GPG key

view this post on Zulip jamie jamison (Jun 18 2024 at 15:31):

For the payara server log part of the problem is I'm not sure what to look for. I'm partially relying on the time stamp.
For shib I'm obviously not importing the GPG correctly. I have back-to-back patron consults till noon and then will dig back in.
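On what to look for in server.log: the records quoted in this thread follow the pattern `[#|timestamp|LEVEL|product|logger|fields|message|#]`, so one option is to pull out only the WARNING and SEVERE records. A minimal sketch under that assumption follows. The function names are hypothetical, and the log path in the usage comment is the usual Payara location, not something stated in this thread.

```python
import re

# One record runs from "[#|" to "|#]" and may span several lines.
RECORD = re.compile(r"\[#\|(.*?)\|#\]", re.DOTALL)


def parse_records(text):
    """Split Payara server.log text into dicts, one per log record."""
    records = []
    for raw in RECORD.findall(text):
        parts = raw.split("|", 5)
        if len(parts) < 6:
            continue  # not in the expected six-field layout
        timestamp, level, product, logger, _fields, message = parts
        records.append({
            "timestamp": timestamp,
            "level": level,
            "product": product,
            "logger": logger,
            "message": message.strip(),
        })
    return records


def problems(text, levels=("WARNING", "SEVERE")):
    """Keep only the records whose level is in `levels`."""
    return [r for r in parse_records(text) if r["level"] in levels]


# A record copied from earlier in this thread.
SAMPLE = (
    "[#|2024-06-11T02:46:25.688+0000|WARNING|Payara 6.2023.8|"
    "edu.harvard.iq.dataverse.dataaccess.DataAccess|"
    "_ThreadID=90;_ThreadName=http-thread-pool::jk-connector(4);"
    "_TimeMillis=1718073985688;_LevelValue=900;|\n"
    "Could not find storage driver for: s3|#]\n"
)

for rec in problems(SAMPLE):
    print(rec["timestamp"], rec["level"], rec["logger"], "-", rec["message"])

# Usage on a real file (path is an assumption):
#   with open("/usr/local/payara6/glassfish/domains/domain1/logs/server.log") as f:
#       for rec in problems(f.read()):
#           print(rec["timestamp"], rec["level"], rec["message"])
```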

view this post on Zulip Don Sizemore (Jun 18 2024 at 18:44):

my shibboleth.repo is looking at

gpgkey=https://shibboleth.net/downloads/service-provider/RPMS/repomd.xml.key
https://shibboleth.net/downloads/service-provider/RPMS/cantor.repomd.xml.key

view this post on Zulip jamie jamison (Jun 18 2024 at 19:04):

That is the same thing that I have:
[shibboleth]
name=Shibboleth (rockylinux9)
# Please report any problems to https://shibboleth.atlassian.net/jira
type=rpm-md
mirrorlist=https://shibboleth.net/cgi-bin/mirrorlist.cgi/rockylinux9
gpgcheck=1
gpgkey=https://shibboleth.net/downloads/service-provider/RPMS/repomd.xml.key
        https://shibboleth.net/downloads/service-provider/RPMS/cantor.repomd.xml.key
enabled=1

view this post on Zulip jamie jamison (Jun 24 2024 at 18:17):

One thing I am running into is that out-of-the-box I can't log into Dataverse. I've tried to unblock admin in the main.yml script but that didn't help.
Is there a way via the API to change the dataverseAdmin password? I'm reading through the documentation but haven't found that yet.

view this post on Zulip Don Sizemore (Jun 24 2024 at 19:48):

@jamie jamison if you're using Dataverse-Ansible you can set it as a group_var?

view this post on Zulip jamie jamison (Jun 24 2024 at 20:09):

Ok, I'll look into that. I did try setting the dataverseAdmin password in the ansible script.

Other thing: even before I try to add s3 buckets I get this:
exception

jakarta.servlet.ServletException: /dataset.xhtml @595,164 rendered="#{settingsWrapper.makeDataCountDisplayEnabled and DatasetPage.doi}": The class 'edu.harvard.iq.dataverse.DatasetPage' does not have the property 'doi'.
root cause

jakarta.el.PropertyNotFoundException: /dataset.xhtml @595,164 rendered="#{settingsWrapper.makeDataCountDisplayEnabled and DatasetPage.doi}": The class 'edu.harvard.iq.dataverse.DatasetPage' does not have the property 'doi'.

I'm wondering where makeDataCount is enabled - trying to find in main.yml

view this post on Zulip Don Sizemore (Jun 25 2024 at 14:52):

@jamie jamison I don't think Ansible knows about MDC yet?

view this post on Zulip jamie jamison (Jun 25 2024 at 16:26):

Ok, mostly I'm trying to understand the error messages - lot of this is new to me so I'm still limping up the learning curve

view this post on Zulip jamie jamison (Jun 25 2024 at 16:29):

Since I'm using a reloaded dvndb database (from the RHEL7 test dataverse) I'm wondering if it's possible that MDC is in the database and causing that error.

view this post on Zulip jamie jamison (Jun 25 2024 at 16:33):

And I may have found something in the 6.2 documentation - https://guides.dataverse.org/en/latest/admin/make-data-count.html#enable-or-disable-display-of-make-data-count-metrics

And that seems to solve at least that error. I'm putting that in my notes - things to consider when restoring a previous database

view this post on Zulip jamie jamison (Jun 25 2024 at 20:41):

Actually the most exasperating issue at this point is that the default, out-of-the-box dataverseAdmin password doesn't work, though this might also be because of the reloaded dvndb database. Is that where passwords are stored?

view this post on Zulip jamie jamison (Jun 26 2024 at 18:26):

Also, sometimes payara6 seems 'stuck'. Unable to stop or restart or even kill the process.

payara6.service - Payara Server
Loaded: loaded (/etc/systemd/system/payara6.service; enabled; preset: disabled)
Active: failed (Result: exit-code) since Wed 2024-06-26 17:43:43 UTC; 39min ago
Process: 720 ExecStart=/usr/bin/java -jar /usr/local/payara6/glassfish/lib/client/appserver-cli.jar start-domain (code=exited, st>
CPU: 1.304s

Jun 26 17:43:38 ip-172-31-23-165.us-west-2.compute.internal systemd[1]: Starting Payara Server...
Jun 26 17:43:43 ip-172-31-23-165.us-west-2.compute.internal java[720]: com.sun.enterprise.universal.xml.MiniXmlParserException: "Xml >
Jun 26 17:43:43 ip-172-31-23-165.us-west-2.compute.internal java[720]: Message: elementGetText() function expects text only elment bu>
Jun 26 17:43:43 ip-172-31-23-165.us-west-2.compute.internal java[720]: Command start-domain failed.
Jun 26 17:43:43 ip-172-31-23-165.us-west-2.compute.internal systemd[1]: payara6.service: Control process exited, code=exited, status=>
Jun 26 17:43:43 ip-172-31-23-165.us-west-2.compute.internal systemd[1]: payara6.service: Failed with result 'exit-code'.
Jun 26 17:43:43 ip-172-31-23-165.us-west-2.compute.internal systemd[1]: Failed to start Payara Server.
Jun 26 17:43:43 ip-172-31-23-165.us-west-2.compute.internal systemd[1]: payara6.service: Consumed 1.304s CPU time.
[rocky@ip-172-31-23-165 logs]$

view this post on Zulip Don Sizemore (Jun 27 2024 at 16:41):

@jamie jamison I've seen Payara do this on beta.dataverse.org - is there more information in server.log? on dataverseAdmin - since you have access to the DB, you could update the e-mail address associated with userid '1' in the authenticateduser table and trigger a password reset, or you could just blank the password entry?

view this post on Zulip Don Sizemore (Jun 27 2024 at 16:42):

on restoring databases / moving servers: I keep a shell script around on test machines, reset_settings.sh or some such, with a list of commonly-required curl commands to make the test server a test server when I've imported a copy of the production database

view this post on Zulip jamie jamison (Jun 27 2024 at 18:52):

And the problem turned out to be a typo by me. Now back on track.

The current s3 bucket related error is:
6.2023.8|edu.harvard.iq.dataverse.dataaccess.DataAccess|_ThreadID=88;_ThreadName=http-thread-pool::jk-connector(2);_TimeMillis=1719511203190;_LevelValue=900;|
  Could not find storage driver for: s3|#]

Lastly, I can't seem to wget the reset_settings.sh file. Is there another place to get it?

view this post on Zulip jamie jamison (Jun 27 2024 at 22:31):

About the storage driver issue, here is part of the domain.xml code:

<jvm-options>-Ddataverse.files.storage-driver-id=s3</jvm-options> <- is this correct, the default is 'file' but should it be changed to 's3' for the s3 buckets?

    <jvm-options>-Xmx3841m</jvm-options>
    <jvm-options>-Ddataverse.files.dataverse-test-oregon.type=s3</jvm-options>
    <jvm-options>-Ddataverse.files.dataverse-test-oregon.label=dataverse-test-oregon</jvm-options>
    <jvm-options>-Ddataverse.files.dataverse-test-oregon.bucket-name=dataverse-test-oregon</jvm-options>
    <jvm-options>-Ddataverse.files.dataverse-test-oregon.download-redirect=true</jvm-options>
    <jvm-options>-Ddataverse.files.dataverse-test-oregon.upload-redirect=true</jvm-options>

view this post on Zulip Don Sizemore (Jul 01 2024 at 16:10):

@jamie jamison the script was just something I threw together to make a copy of my production database suitable for our test server. I'm happy to share mine with you (just some API calls to change settings). if you're all S3, then yes you want storage-driver-id=s3. it may be because your label is dataverse-test-oregon instead of s3?

view this post on Zulip jamie jamison (Jul 01 2024 at 18:32):

This is what I'm a bit confused about. According to the documentation:
dataverse.files.<id>.label <?> Required label to be shown in the UI for this storage (default: none)

So if it's correct that the label is what's shown in the UI, wouldn't the label be 'dataverse-test-oregon'?

view this post on Zulip Philip Durbin 🚀 (Jul 03 2024 at 16:45):

Did you figure it out, what's shown in the UI? (I'm not sure, myself.)

view this post on Zulip jamie jamison (Jul 03 2024 at 16:55):

We're still working on it. A fresh install with a reloaded database is messier than a fresh install. You guys get the notes on how we did it when done.

view this post on Zulip Philip Durbin 🚀 (Jul 03 2024 at 16:56):

Great, much appreciated. Good luck!

view this post on Zulip Don Sizemore (Jul 05 2024 at 17:55):

@jamie jamison the label appears to be a friendly name, the <id> value identifies the datastore. does that make sense?

view this post on Zulip jamie jamison (Jul 05 2024 at 18:47):

In my case probably not. I understand the label. For the test I'm using the bucket name as the friendly name. For the <id> I guess that's where I'm confused. If the default is 'file' for the local file system, what is appropriate for s3 storage?

Documentation:
dataverse.files.storage-driver-id <id> Enable <id> as the default storage driver. file

view this post on Zulip Don Sizemore (Jul 09 2024 at 12:44):

@jamie jamison the <id> bit threw me as well, ours is simply "s3" and suits us for now. I'll need to get more descriptive as we add datastores.

view this post on Zulip jamie jamison (Jul 09 2024 at 17:31):

Ok. Right now dataverse 6.2 is dead in the water because it can't access the s3 buckets and 5.14 is giving me grief since the operating system is end-of-life.

view this post on Zulip Don Sizemore (Jul 09 2024 at 19:17):

@jamie jamison remind me of the error you get accessing the buckets from 6.2?

view this post on Zulip jamie jamison (Jul 09 2024 at 19:22):

jamie jamison: And the problem turned out to be a typo by me. Now back on track.

The current s3 bucket related error is:
6.2023.8|edu.harvard.iq.dataverse.dataaccess.DataAccess|_ThreadID=88;_ThreadName=http-thread-pool::jk-connector(2);_TimeMillis=1719511203190;_LevelValue=900;|
  Could not find storage driver for: s3|#]

Lastly, I can't seem to wget the reset_settings.sh file. Is there another place to get it?

jamie jamison: About the storage driver issue, here is part of the domain.xml code:

<jvm-options>-Ddataverse.files.storage-driver-id=s3</jvm-options> <- is this correct, the default is 'file' but should it be changed to 's3' for the s3 buckets?

    <jvm-options>-Xmx3841m</jvm-options>
    <jvm-options>-Ddataverse.files.dataverse-test-oregon.type=s3</jvm-options>
    <jvm-options>-Ddataverse.files.dataverse-test-oregon.label=dataverse-test-oregon</jvm-options>
    <jvm-options>-Ddataverse.files.dataverse-test-oregon.bucket-name=dataverse-test-oregon</jvm-options>
    <jvm-options>-Ddataverse.files.dataverse-test-oregon.download-redirect=true</jvm-options>
    <jvm-options>-Ddataverse.files.dataverse-test-oregon.upload-redirect=true</jvm-options>

view this post on Zulip Philip Durbin 🚀 (Jul 09 2024 at 19:43):

I'm glad fixing the typo is helping!

view this post on Zulip Philip Durbin 🚀 (Jul 09 2024 at 19:43):

Sorry, a reset settings script? I'm not familiar with it. You're seeing this in the guides or a release note somewhere?

view this post on Zulip jamie jamison (Jul 10 2024 at 16:33):

reset settings script was a suggestion from Don. Not in the guides or release notes.
So far still can't access the s3 buckets and unable to get letsencrypt working.

view this post on Zulip Philip Durbin 🚀 (Jul 10 2024 at 16:54):

Oh, I understand now, thanks

view this post on Zulip Don Sizemore (Jul 10 2024 at 19:57):

@jamie jamison in this case your storage-driver-id for s3 is dataverse-test-oregon

view this post on Zulip Don Sizemore (Jul 10 2024 at 19:58):

the reset settings script is just a handful of curl commands I threw together in a script to set the Authority of our test server instead of our production, use the throw-away shoulder, and put up a banner warning that the test server is just a test server. it's really a way to automate the stuff I had to manually fix each time I imported a copy of our production database.

view this post on Zulip jamie jamison (Jul 10 2024 at 19:58):

I guess I didn't completely understand the documentation. Right now test is down while I try to get letsencrypt working. But as soon as it's back I'll try that. The clarification is helpful.

view this post on Zulip jamie jamison (Jul 11 2024 at 02:45):

This is aside from the installation problem. Since Dataverse 5.14 on RedHat7 is barely working or staying up I've found I can still publish some of the user datasets with the API - so they can get DOIs for publishing. API works great.

view this post on Zulip jamie jamison (Jul 11 2024 at 21:48):

Is anyone else having issues with a 'stuck' payara6? The service can't be stopped, restarted or killed.
sudo systemctl status payara6
× payara6.service - Payara Server
Loaded: loaded (/etc/systemd/system/payara6.service; enabled; preset: disabled)
Active: failed (Result: timeout) since Tue 2024-07-09 17:49:42 UTC; 2 days ago
CPU: 28.414s

Jul 09 17:47:41 ip-172-31-23-165.us-west-2.compute.internal systemd[1]: Starting Payara Server...
Jul 09 17:49:41 ip-172-31-23-165.us-west-2.compute.internal systemd[1]: payara6.service: start operation timed out. Terminating.
Jul 09 17:49:42 ip-172-31-23-165.us-west-2.compute.internal java[728]: Waiting for domain1 to start .............................>
Jul 09 17:49:42 ip-172-31-23-165.us-west-2.compute.internal systemd[1]: payara6.service: Control process exited, code=exited, sta>
Jul 09 17:49:42 ip-172-31-23-165.us-west-2.compute.internal systemd[1]: payara6.service: Failed with result 'timeout'.
Jul 09 17:49:42 ip-172-31-23-165.us-west-2.compute.internal systemd[1]: Failed to start Payara Server.
Jul 09 17:49:42 ip-172-31-23-165.us-west-2.compute.internal systemd[1]: payara6.service: Consumed 28.414s CPU time.
lines 1-12/12 (END)

view this post on Zulip jamie jamison (Jul 12 2024 at 01:25):

Don Sizemore said:

jamie jamison in this case your storage-driver-id for s3 is dataverse-test-oregon

I think I need one more clarification. If there are 4 s3 buckets (bucket1, bucket2, bucket3, bucket4), does each one have to have its own storage-driver-id?
dataverse.files.bucket1-storage-driver-id=bucket1
dataverse.files.bucket2-storage-driver-id=bucket2
dataverse.files.bucket3-storage-driver-id=bucket3
dataverse.files.bucket4-storage-driver-id=bucket4

view this post on Zulip Oliver Bertuch (Jul 12 2024 at 07:18):

These options are all mixed up. The <id> part needs to be a unique identifier for your storage, the storage-driver-id option (mind the ".") references the driver to use for storage <id>.

Here's a complete example of an S3 storage from our beta env:

dataverse.files.s3.bucket-name=juelich_data_beta
dataverse.files.s3.custom-endpoint-url=https://s3.fz-juelich.de
dataverse.files.s3.label=Jülich-DATA-Object-Store
dataverse.files.s3.path-style-access=true
dataverse.files.s3.type=s3
dataverse.files.storage-driver-id=s3

So in your case, this needs to be (assuming all of them are S3):

dataverse.files.bucket1.storage-driver-id=s3
dataverse.files.bucket1.s3.xxx
...
dataverse.files.bucket2.storage-driver-id=s3
dataverse.files.bucket2.s3.xxx
...
dataverse.files.bucket3.storage-driver-id=s3
dataverse.files.bucket3.s3.xxx
...
dataverse.files.bucket4.storage-driver-id=s3
dataverse.files.bucket4.s3.xxx
...

view this post on Zulip Philip Durbin 🚀 (Jul 12 2024 at 14:29):

The docs say "dataverse.files.storage-driver-id" allows you to "Enable <id> as the default storage driver." -- https://guides.dataverse.org/en/6.3/installation/config.html#list-of-s3-storage-options

So you'll want to pick one of your four buckets as the <id>, like this:

dataverse.files.storage-driver-id=bucket3
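
Putting the two answers together, a minimal shell sketch of the full option set for four S3 stores plus the single instance-wide default might look like the following. The ids bucket1 to bucket4, the labels, and the assumption that each bucket name equals its id are placeholders taken from the question above. The helper storage_opts is hypothetical and only prints the JVM options so they can be reviewed first.

```shell
#!/bin/sh
# Print the JVM options for one S3 store.
# $1 = store <id> (placeholder, e.g. bucket1)
# $2 = S3 bucket name
# $3 = label for the store
storage_opts() {
  id="$1"; bucket="$2"; label="$3"
  printf '%s\n' \
    "-Ddataverse.files.${id}.type=s3" \
    "-Ddataverse.files.${id}.label=${label}" \
    "-Ddataverse.files.${id}.bucket-name=${bucket}"
}

# One group of options per store; the <id> is part of every key.
for n in 1 2 3 4; do
  storage_opts "bucket${n}" "bucket${n}" "Bucket${n}"
done

# Exactly one instance-wide default. Its key has no <id>; the <id> is the value.
echo "-Ddataverse.files.storage-driver-id=bucket3"
```

Each printed line could then be passed to asadmin create-jvm-options, the same way as the --add-opens option earlier in this thread. Values that contain a colon, such as an endpoint URL, would likely need the colon escaped for asadmin.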

view this post on Zulip Oliver Bertuch (Jul 12 2024 at 15:04):

Dang you're right! I mixed that up myself! See, it's fricking complicated!

view this post on Zulip Oliver Bertuch (Jul 12 2024 at 15:04):

One day I will make the MPCONFIG part for the storage system and take care to rename the default driver option!

view this post on Zulip Philip Durbin 🚀 (Jul 12 2024 at 15:11):

So friggin' complicated. Need more docs, I guess, short term.

view this post on Zulip jamie jamison (Jul 12 2024 at 16:15):

I'll make an example along with the notes I send you. Last question to clarify. You only need 1 default-driver-id? Even with 2 or more s3 buckets?

view this post on Zulip Oliver Bertuch (Jul 12 2024 at 17:46):

Yes. That's the driver that will be assigned to a new collection if you don't specify a different one.

view this post on Zulip Oliver Bertuch (Jul 12 2024 at 17:46):

There's only 1 instance wide default :-)

view this post on Zulip jamie jamison (Jul 12 2024 at 17:47):

Thank you for clarifying. I'll send over my example as soon as I get finished setting up the new server.

view this post on Zulip jamie jamison (Jul 12 2024 at 20:54):

Here is some code following the example:
<jvm-options>-Ddataverse.files.datavers-test-oregon.storage-driver-id=s3</jvm-options>
<jvm-options>-Ddataverse.files.dataverese-test-oregon.type=s3</jvm-options>
<jvm-options>-Ddataverse.files.dataverse-test-oregon=dataverse-test-oregon</jvm-options>
<jvm-options>-Ddataverse.files.dataverse-test-oregon.download-redirect=true</jvm-options>
<jvm-options>-Ddataverse.files.dataverse-test-oregon.upload-redirect=true</jvm-options>

<jvm-options>-Ddataverse.files.ssda-files.storage-driver-id=s3</jvm-options>
<jvm-options>-Ddataverse.files.ssda-files.type=s3</jvm-options>
<jvm-options>-Ddataverse.files.ssda-files=ssda-files</jvm-options>
<jvm-options>-Ddataverse.files.ssda-files.download-redirect=true</jvm-options>
<jvm-options>-Ddataverse.files.ssda-files.upload-redirect=true</jvm-options>

and I still get:
[#|2024-07-12T20:14:08.003+0000|WARNING|Payara 6.2023.8|edu.harvard.iq.dataverse.dataaccess.DataAccess|_ThreadID=88;_ThreadName=http-thread-pool::jk-connector(3);_TimeMillis=1720815248003;_LevelValue=900;|
Could not find storage driver for: s3|#]

[#|2024-07-12T20:14:08.003+0000|WARNING|Payara 6.2023.8|edu.harvard.iq.dataverse.ThumbnailServiceWrapper|_ThreadID=88;_ThreadName=http-thread-pool::jk-connector(3);_TimeMillis=1720815248003;_LevelValue=900;|
getDatasetCardImageAsUrl(): Failed to initialize dataset StorageIO for s3://10.5072/FK2/MN0MUZ (getDataAccessObject: Unsupported storage method.)|#]

[#|2024-07-12T20:14:08.008+0000|WARNING|Payara 6.2023.8|edu.harvard.iq.dataverse.dataaccess.DataAccess|_ThreadID=88;_ThreadName=http-thread-pool::jk-connector(3);_TimeMillis=1720815248008;_LevelValue=900;|
Could not find storage driver for: s3|#]

[#|2024-07-12T20:14:08.009+0000|WARNING|Payara 6.2023.8|edu.harvard.iq.dataverse.ThumbnailServiceWrapper|_ThreadID=88;_ThreadName=http-thread-pool::jk-connector(3);_TimeMillis=1720815248009;_LevelValue=900;|
getDatasetCardImageAsUrl(): Failed to initialize dataset StorageIO for s3://10.5072/FK2/KHOWY6 (getDataAccessObject: Unsupported storage method.)|#]

So maybe the problem is something else? I did check the aws configuration and that looks correct.

view this post on Zulip Philip Durbin 🚀 (Jul 12 2024 at 21:00):

dataverse.files.datavers-test-oregon.storage-driver-id=s3 has a typo in it. An "e" is missing from the end of "dataverse".

view this post on Zulip jamie jamison (Jul 12 2024 at 21:01):

Well, that's embarrassing for an old secretary... Will go retry now.

view this post on Zulip Philip Durbin 🚀 (Jul 12 2024 at 21:01):

Personally, I would use an <id> that doesn't have hyphens in it. But maybe they do work. Dunno.

view this post on Zulip jamie jamison (Jul 12 2024 at 21:09):

I fixed the typo. I still get the error. The reason I formatted it the way I did was an example from Oliver Bertuch (above):
dataverse.files.bucket4.storage-driver-id=s3

and this is what's in documentation:
dataverse.files.storage-driver-id <id> Enable <id> as the default storage driver. file

view this post on Zulip Philip Durbin 🚀 (Jul 12 2024 at 21:29):

I feel like there's something amiss with your config but it's late here. Maybe I can look at it with fresh eyes next week.

view this post on Zulip Oliver Bertuch (Jul 12 2024 at 21:31):

There's also the missing .label thing

view this post on Zulip Oliver Bertuch (Jul 12 2024 at 21:32):

Educated guess on the error messages: the storage driver ID is IIRC built into the location URL. So there is no storage driver called "s3" around as per the config you posted

view this post on Zulip Oliver Bertuch (Jul 12 2024 at 21:34):

Yupp. These storage id's need to be migrated... (DataAccess.getStorageIdFromLocation)

view this post on Zulip Oliver Bertuch (Jul 12 2024 at 21:35):

Or you configure a storage driver that has an id "s3" :wink:
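
To illustrate the point about the location URL: the failing locations in the log start with s3://, so the part before :// is what gets looked up as the store id. A minimal shell sketch of that lookup follows. It only imitates the behaviour described here and is not the actual DataAccess code. The list of configured ids is an assumption based on the options posted above.

```shell
#!/bin/sh
# Storage location copied from the warning in the log above.
location="s3://10.5072/FK2/MN0MUZ"

# Everything before "://" is treated as the store id.
driver_id="${location%%://*}"

# Ids that the posted JVM options were meant to define (none is called "s3").
configured_ids="dataverse-test-oregon ssda-files"

found=no
for id in $configured_ids; do
  if [ "$id" = "$driver_id" ]; then
    found=yes
  fi
done

echo "driver id from location: $driver_id, configured: $found"
```

With the ids above the lookup fails, which matches the "Could not find storage driver for: s3" warning. Either the stored locations get migrated to a configured id, or a store with the id s3 is configured.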

view this post on Zulip jamie jamison (Jul 12 2024 at 21:50):

Could you post a sample of what that looks like? I've been using this example:
dataverse.files.bucket2.storage-driver-id=s3

view this post on Zulip Oliver Bertuch (Jul 14 2024 at 22:53):

Related: https://github.com/IQSS/dataverse/issues/10684

view this post on Zulip Philip Durbin 🚀 (Jul 15 2024 at 14:07):

@jamie jamison this one doesn't look right, for example:

<jvm-options>-Ddataverse.files.ssda-files=ssda-files</jvm-options>

ssda-files is your <id> and there should be something after it like .type or .label or whatever.
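
One way to spot both kinds of problem (misspelled ids, and keys with nothing after the <id>) is to list which ids the options actually use. A rough sketch, assuming the options are available one per line; here they are pasted into a here-document, while on a real server they would come from the jvm-options lines in domain.xml. The function names are hypothetical, and global keys other than the default storage-driver-id may show up as false positives in the second check.

```shell
#!/bin/sh
# List the distinct <id> values used in dataverse.files.<id>.<option> keys.
# Reads one option per line on stdin.
list_store_ids() {
  sed -n 's/.*-Ddataverse\.files\.\([^.=]*\)\..*/\1/p' | LC_ALL=C sort -u
}

# List keys that stop right after the <id> (no .type, .label, and so on).
# The instance-wide default storage-driver-id is skipped on purpose.
list_bare_keys() {
  grep -E -- '-Ddataverse\.files\.[^.=]+=' |
    grep -v -- '-Ddataverse\.files\.storage-driver-id='
}

# Sample input: the first four options as posted earlier in this thread.
sample=$(cat <<'EOF'
<jvm-options>-Ddataverse.files.datavers-test-oregon.storage-driver-id=s3</jvm-options>
<jvm-options>-Ddataverse.files.dataverese-test-oregon.type=s3</jvm-options>
<jvm-options>-Ddataverse.files.dataverse-test-oregon=dataverse-test-oregon</jvm-options>
<jvm-options>-Ddataverse.files.dataverse-test-oregon.download-redirect=true</jvm-options>
EOF
)

echo "ids in use:"
printf '%s\n' "$sample" | list_store_ids

echo "keys with nothing after the id:"
printf '%s\n' "$sample" | list_bare_keys
```

Three different ids for what should be a single store make the spelling variants easy to see.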

view this post on Zulip jamie jamison (Jul 15 2024 at 22:58):

Another non-s3 question. We're still having trouble with the install. I was asked what branch we should be using if we want to install dataverse 6.2 on rocky 9. Ongoing issues with letsencrypt.
Should we use 243_rocky_9 or 6.2?

view this post on Zulip Kristian Allen - UCLA (Jul 15 2024 at 23:00):

Thanks @jamie, if we would like to use the community ansible scripts for dvn 6.2 on rocky 9, what is the suggested branch to use?
https://github.com/gdcc/dataverse-ansible/tree/v6.2 or
https://github.com/gdcc/dataverse-ansible/tree/243_rocky_9

view this post on Zulip Philip Durbin 🚀 (Jul 16 2024 at 13:35):

@Kristian Allen - UCLA is this a new installation? Typically that's what the Ansible scripts are used for.

view this post on Zulip jamie jamison (Jul 16 2024 at 16:52):

Yes, we are doing a fresh install because we were unable to update RHEL7 in-place. So we are doing a fresh install of 6.2 on Rocky 9

view this post on Zulip Philip Durbin 🚀 (Jul 16 2024 at 16:56):

Right, right. I don't want to tell you the wrong thing. Better wait for @Don Sizemore

view this post on Zulip Kristian Allen - UCLA (Jul 16 2024 at 17:04):

yes @jamie jamison is right -- clean install. We've been using the v6.2 branch but have been having a few issues. We'll submit some PRs once things settle down and we get things working.

The current blocker is multiple processes grabbing the same port.

Following the readme, we updated the main.yml:

apache:
  enabled: true
  public_fqdn:
  ssl:
    enabled: false
    remote_cert: false
    port: 443
    cert:
    interm:
    key:
    pem:
      cert:
      key:
      interm:
  port: 80

But is 443 or 80 redefined anywhere, like in dataverse-apache.yml?

- name: certbot bonks on listen 443
  lineinfile:
    path: '{{ apache_virtualhost_dir }}/http.proxy.conf'
    regexp: '^Listen 443 https'
    state: absent
  when: letsencrypt.enabled == true

view this post on Zulip Don Sizemore (Jul 16 2024 at 17:17):

@Kristian Allen - UCLA the Apache proxy bits of Dataverse-Ansible are very, very untested. pull requests to fix problems or specific issue descriptions are most welcome.

view this post on Zulip jamie jamison (Jul 16 2024 at 18:03):

Well, we're definitely testing them at UCLA.

view this post on Zulip jamie jamison (Jul 16 2024 at 20:57):

@Don Sizemore We're wondering if we are using the correct branch - 243_rocky_9 or 6.2. So I don't make a pull request for the wrong branch.

view this post on Zulip Don Sizemore (Jul 17 2024 at 12:41):

@jamie jamison I could attempt to merge the 6.2 branch with Rocky9, if you like? I'm happy to schedule a meeting. The httpd portion of Dataverse-Ansible is entirely untested. LetsEncrypt refuses to issue certs for addresses ending in amazon.com, so that code is theoretical.

view this post on Zulip jamie jamison (Jul 17 2024 at 15:56):

I'm going to wait for Tim and Kristian but wondering if I should just do the fresh install with 6.0 or 6.1 and then do conventional updates?


view this post on Zulip jamie jamison (Jul 17 2024 at 16:29):

How are most people doing a fresh install?

view this post on Zulip Oliver Bertuch (Jul 17 2024 at 16:32):

Don Sizemore do you think it might help @jamie jamison and @Kristian Allen - UCLA using the way I'm upgrading with migrations and all?

view this post on Zulip jamie jamison (Jul 17 2024 at 16:35):

Since the ansible script for 6.2 doesn't seem to be working I'm happy to hear suggestions

view this post on Zulip jamie jamison (Jul 17 2024 at 16:36):

I can't upgrade from 5.14 because RHEL7 hit end of life. I have to do a fresh install (on rocky9).

view this post on Zulip jamie jamison (Jul 17 2024 at 16:44):

It's the letsencrypt part that fails. We're going to try yet again and install lets encrypt after the fact. But I'd still like to know how other people are doing fresh installs.
Successfully, that is.

view this post on Zulip Oliver Bertuch (Jul 17 2024 at 17:15):

Have you tried disabling the Let's Encrypt part from the DV Ansible? And then using another role that has been tested working on RHEL 9 or doing it manually?

view this post on Zulip Oliver Bertuch (Jul 17 2024 at 17:15):

Just to get stuff going...

view this post on Zulip jamie jamison (Jul 17 2024 at 17:23):

In the 6.2 ansible script, letsencrypt out-of-the-box is disabled. We're going to test again on a fresh Rocky 9 and see how that goes.

view this post on Zulip jamie jamison (Jul 17 2024 at 19:42):

@Don Sizemore Would you have any time to meet on a friday?

view this post on Zulip Don Sizemore (Jul 17 2024 at 20:19):

@Oliver Bertuch Dataverse-Ansible works with RHEL9, but IIRC that (finally) got merged after the 6.2 branch

view this post on Zulip Don Sizemore (Jul 17 2024 at 20:19):

@jamie jamison I think I should be able to meet late Friday morning or in the afternoon? You'll appreciate that a test server here started inexplicably receiving 403 Forbiddens from AWS S3. No config changes.

view this post on Zulip Don Sizemore (Jul 17 2024 at 20:32):

@jamie jamison on your 403 Forbiddens: did you update your CORS rules per the CORS section of https://github.com/IQSS/dataverse/releases/v5.12 ?

view this post on Zulip jamie jamison (Jul 17 2024 at 20:58):

We're still back at getting letsencrypt working but I believe I did since we were at 5.14

view this post on Zulip jamie jamison (Jul 17 2024 at 20:59):

We're in a different time zone but morning or afternoon is fine. Are you on EST?

view this post on Zulip jamie jamison (Jul 17 2024 at 21:03):

12:30 or 1pm our time? I'll figure out pst to your time

view this post on Zulip Don Sizemore (Jul 18 2024 at 14:00):

We're on EDT, but I'm getting 403 Forbiddens on a test server, so I'm interested in comparing notes there.

view this post on Zulip jamie jamison (Jul 18 2024 at 15:13):

What's a good time for you?

view this post on Zulip Don Sizemore (Jul 18 2024 at 17:05):

@jamie jamison I have a 9:45 EDT with Sonia, but other than that I'm free. On the 403 forbiddens: how are you defining your S3 credentials?

view this post on Zulip jamie jamison (Jul 18 2024 at 17:15):

Dealing with the letsencrypt and certbot somewhat distracted me from the s3 buckets. I'm getting back to that today

view this post on Zulip jamie jamison (Jul 18 2024 at 18:47):

I should add that I need a working system to deal with the s3 buckets and at this time I don't have that. I'm wondering how other people do their installations or migration installations: with the Ansible script or the manual install?

view this post on Zulip jamie jamison (Jul 23 2024 at 23:51):

I have a working test dataverse and some working s3 buckets. I'm using Oliver Bertuch's example:
dataverse.files.bucket1.storage-driver-id=s3
dataverse.files.bucket1.s3.xxx
...
dataverse.files.bucket2.storage-driver-id=s3
dataverse.files.bucket2.s3.xxx
...
dataverse.files.bucket3.storage-driver-id=s3
dataverse.files.bucket3.s3.xxx
...
dataverse.files.bucket4.storage-driver-id=s3
dataverse.files.bucket4.s3.xxx

That is, each bucket needs to have its own storage-id.

view this post on Zulip Philip Durbin 🚀 (Jul 24 2024 at 13:30):

It still looks a little off to me but if it's working, great!

view this post on Zulip jamie jamison (Jul 24 2024 at 15:52):

The above example is from @Oliver Bertuch. I guess what I had been trying to verify is whether every bucket needs its own storage-driver if they are all the same storage type.

view this post on Zulip Kristian Allen - UCLA (Jul 31 2024 at 20:43):

Our current theory is to do a fresh install of 6.3 via Ansible (this is working), then migrate the database from 5.14 to 6.3 via Flyway. Does this make sense, or does anyone see any areas to be aware of?
The thinking was that https://github.com/IQSS/dataverse/blob/0d279573bb8d7c96d7a4a1dc4b66b2258059dfba/src/main/java/edu/harvard/iq/dataverse/flyway/StartupFlywayMigrator.java#L16 can be run on its own?
Or is there a cleaner way to do this by pulling objects out and putting them back in via the DVN API?

view this post on Zulip Philip Durbin 🚀 (Jul 31 2024 at 21:00):

I mean, people have pulled off similar feats but officially we ask that you step through each upgrade. (We know this is painful.)

view this post on Zulip Kristian Allen - UCLA (Jul 31 2024 at 22:11):

I don't think the step-through upgrade path would work in our case, as our 5.14 instance is on an older version of RHEL and we have to move to a new OS (Rocky), so we were trying to determine a path forward.

view this post on Zulip Kristian Allen - UCLA (Aug 01 2024 at 00:48):

I think we'll power forward with the manual migration but keep meticulous notes, and then we'll make sure to write it up for the community.

view this post on Zulip Don Sizemore (Aug 01 2024 at 11:53):

@Kristian Allen - UCLA I personally think that your new server is a great spot to migrate/upgrade from 5.14 to 6.0, then carry on to 6.3.

view this post on Zulip Philip Durbin 🚀 (Aug 01 2024 at 13:53):

@Kristian Allen - UCLA you might be interested in this "Our experience on upgrading dataverse from 4.2.2 to 6.1 in a non-standard way" thread: https://groups.google.com/g/dataverse-community/c/BUO37-reWIs/m/18SK0DZlAQAJ

view this post on Zulip Oliver Bertuch (Aug 04 2024 at 08:50):

I have been doing experiments on a similar migration from 4.20 to 6.3 using Flyway only as well.

view this post on Zulip Oliver Bertuch (Aug 04 2024 at 08:51):

You can find the necessary extra migrations you will need when using Flyway only here: https://jugit.fz-juelich.de/fdm/dev/dataverse/-/tree/juelich-data-upgrade/src/main/resources/db/extra?ref_type=heads

Obviously, you'll only need the 5.14+ migrations, not the others.

view this post on Zulip Oliver Bertuch (Aug 04 2024 at 08:52):

I've also added the Flyway Maven Plugin in this branch to the DV app POM. https://jugit.fz-juelich.de/fdm/dev/dataverse/-/blob/juelich-data-upgrade/pom.xml?ref_type=heads#L755

view this post on Zulip Oliver Bertuch (Aug 04 2024 at 08:52):

I can share more details about how I used a few local containers to get the job done if that helps.

view this post on Zulip jamie jamison (Aug 05 2024 at 20:58):

Don Sizemore said:

Kristian Allen - UCLA I personally think that your new server is a great spot to migrate/upgrade from 5.14 to 6.0, then carry on to 6.3.

@Don Sizemore I'm trying to install a fresh test 5.14. The directions say "may be installed using branches tagged with that version". What is the correct syntax to tag the version?

Would this look like:

git clone https://github.com/GlobalDataverseCommunityConsortium/dataverse-ansible.git dataverse-5.14

view this post on Zulip Philip Durbin 🚀 (Aug 05 2024 at 21:06):

I might get this wrong, but I believe it refers to the tagging on the Dataverse side.

I think you'd change this from "version:6.3" to whatever version you want but I'm not sure if it "just works": https://github.com/gdcc/dataverse-ansible/blob/c1e21f255dd00bf5d95c2a6a807ae7579974361b/defaults/main.yml#L264

view this post on Zulip jamie jamison (Aug 05 2024 at 21:46):

Oliver Bertuch said:

I can share more details about how I used a few local containers to get the job done if that helps.

@Oliver Bertuch Yes, some details on how this is used would help.

view this post on Zulip Oliver Bertuch (Aug 06 2024 at 07:08):

Ok so here's a quick list of tasks:

  1. Make a pg_dump of your current database into a file dump.sql
  2. Copy that file to your dev machine (or wherever you have Maven and Java SDK around)
  3. Make sure you have installed Docker on that machine as well (you could also use some arbitrary Linux cloud host if that's easier to get Docker up and running)
  4. Download this compose.yaml and put it next to your dump.sql file.
  5. Run docker compose up to start a PG container exposed on port 54321
  6. Apply my POM patch to enable using the Flyway migrations locally
  7. Get all the extra migrations you need that have a version number larger than your current Dataverse version (that'll be this one and this one) and put them under src/main/resources/db/extra
  8. Now checkout the Flyway migrations missing from the containerized DB: mvn flyway:info. You should see that some migrations are missing.
  9. Now apply the migrations: mvn flyway:migrate -Dflyway.outOfOrder
  10. Check migrations again, should be all applied now: mvn flyway:info
  11. Now move the migrations from the extra folder or comment out that location in the POM - the idea is to make them unavailable again, as we don't want them around for all eternity or during the next upgrade.
  12. Now let's remove these migrations to be 100% upstream compatible again: mvn flyway:repair
  13. Running mvn flyway:info again should no longer list our two extra migrations!
  14. Now make another dump of the migrated database: docker exec -it data-next-postgres pg_dump -U dataverse > dump-v6.3.sql
  15. You can load this database into some other Postgres database now and deploy Dataverse 6.3, configuring DV to use this DB.
  16. Cleanup by running docker compose down

view this post on Zulip Don Sizemore (Aug 06 2024 at 12:34):

@jamie jamison you want to clone the 5.14 branch: $ git clone -b 5.14 https://github.com/gdcc/dataverse-ansible.git 5.14

view this post on Zulip jamie jamison (Aug 06 2024 at 21:06):

Don Sizemore said:

jamie jamison you want to clone the 5.14 branch: $ git clone -b 5.14 https://github.com/gdcc/dataverse-ansible.git 5.14

Tried that and ran ansible:

ansible-playbook --connection=local -vvv -i 5.14/inventory 5.14/dataverse.pb -e "@5.14/defaults/main.yml"

Unfortunately it failed at this step:
[dataverse : install java-nnn-openjdk and other packages for RedHat/Rocky]
task path: /home/rocky/dataverse/tasks/dataverse-prereqs.yml:63

Huge, long error message. Don't know if you want to see that.

view this post on Zulip jamie jamison (Aug 06 2024 at 21:08):

Possible java version error:

fatal: [localhost]: FAILED! => {

"msg": "An unhandled exception occurred while templating '{'version': 11, 'home': '/usr/lib/jvm/java-{{ java.version}}'}'. Error was a <class 'ansible.errors.AnsibleError'>, original

view this post on Zulip Don Sizemore (Aug 07 2024 at 00:27):

@jamie jamison sigh. that again. someone else introduced a reflexive group_var without testing the role. two quick fixes could be to either edit tasks/dataverse-prereqs.yml and hard-code 11 or better yet in group_vars for the java.home variable - either should get you going. Solr mirrors tend to remove old versions, so if it dies downloading Solr, you can give it a custom Solr download url in group_vars pointing to archive.apache.org.

view this post on Zulip jamie jamison (Aug 08 2024 at 22:17):

I finally was able to restore the test.dataverse 5.14 from an aws snapshot. I'm going to keep the notes on restoring earlier versions. Might be helpful for other projects.


Last updated: Oct 30 2025 at 06:21 UTC