Hello everyone, how are you?
I have been trying to upgrade my Dataverse installation from 5.14 to 6.2, but I have had a problem deploying the dataverse war file. The problem is with postgres.
I have created a new database, but the problem continues. There is an image showing the problem.
Does anyone know what I should do?
I downloaded the war file from https://github.com/IQSS/dataverse/releases/tag/v6.2
Hi! First, you have to upgrade to 6.0, then 6.1, then 6.2.
Second, can you please show us server.log?
@Philip Durbin thanks for the answer, so, I'm gonna upgrade version by version.
This is the log file.
Error_Dataverse6.2.txt
I don't understand why I have this postgres error
Did you do this step from https://github.com/IQSS/dataverse/releases/tag/v6.0 ?
sudo -u dataverse /usr/local/payara6/bin/asadmin create-jvm-options --add-opens=java.base/java.io=ALL-UNNAMED
I'm asking because in your log I see module java.base does not "opens java.io" to unnamed module and I'm reminded of #10068.
Hi @Philip Durbin I have decided to install the new dataverse version from scratch. Thank goodness I didn't have any information loaded yet.
Sounds easier!
I noticed Harvard is at 6.2, but this version: v. 6.2 build "v6.2+10451+10463+10383-iqss"
Any special reason why this is not "the" 6.2 release?
UVa is upgrading from 5.14 to 6.2 next week and I am wondering about running into problems. There seem to be a few that folks have encountered (from the Dataverse google group and here on Zulip).
As you might guess, #10451 #10463 and #10383 are all pull requests:
I guess these are all fixes we needed for Harvard Dataverse. They will all be included in 6.3.
I'm finally able to list buckets via aws cli. The production bucket policy doesn't have an explicit line for listing the bucket and doesn't seem to need it. But I added the explicit list ("Action": "s3:ListBucket",) to dataverse-test-oregon and suddenly I could list it with aws cli.
I'm not sure what the difference is but will do this for the new production system.
Back to shibboleth and doi setup correctly in domain.xml.
s3 bucket issues. The production bucket policy (RHEL 7) doesn't have an explicit line for listing the bucket and doesn't seem to need it. But I added the explicit list ("Action": "s3:ListBucket",) to dataverse-test-oregon and suddenly I could list it with aws cli. I don't know if this is a difference between 5.14 and 6.2 but I'm going to follow this example for the production 6.2.
Hmm, is this something we should fix in our guides?
I haven't installed a clean system since 4.11 or thereabouts. It seems that what worked up to 5.14 is different in 6.2. It's a bit of a jump between versions so I may have missed something. Examples for the pid/doi settings might be helpful. I'll post mine when I get through this install.
Sorry, I just realized I posted in the wrong place. I'm limping through 5.14 to 6.2
My new errors are actually helpful. Things I need to correct in domain.xml.
[#|2024-06-11T02:46:25.683+0000|WARNING|Payara 6.2023.8|edu.harvard.iq.dataverse.ThumbnailServiceWrapper|_ThreadID=90;_ThreadName=http-thread-pool::jk-connector(4);_TimeMillis=1718073985683;_LevelValue=900;|
getDatasetCardImageAsUrl(): Failed to initialize dataset StorageIO for s3://10.5072/FK2/QF48PD (getDataAccessObject: Unsupported storage method.)|#]
[#|2024-06-11T02:46:25.688+0000|WARNING|Payara 6.2023.8|edu.harvard.iq.dataverse.dataaccess.DataAccess|_ThreadID=90;_ThreadName=http-thread-pool::jk-connector(4);_TimeMillis=1718073985688;_LevelValue=900;|
Could not find storage driver for: s3|#]
[#|2024-06-11T02:46:25.688+0000|WARNING|Payara 6.2023.8|edu.harvard.iq.dataverse.ThumbnailServiceWrapper|_ThreadID=90;_ThreadName=http-thread-pool::jk-connector(4);_TimeMillis=1718073985688;_LevelValue=900;|
getDatasetCardImageAsUrl(): Failed to initialize dataset StorageIO for s3://10.5072/FK2/MN0MUZ (getDataAccessObject: Unsupported storage method.)|#]
[#|2024-06-11T02:46:25.693+0000|WARNING|Payara 6.2023.8|edu.harvard.iq.dataverse.dataaccess.DataAccess|_ThreadID=90;_ThreadName=http-thread-pool::jk-connector(4);_TimeMillis=1718073985693;_LevelValue=900;|
Could not find storage driver for: s3|#]
[#|2024-06-11T02:46:25.693+0000|WARNING|Payara 6.2023.8|edu.harvard.iq.dataverse.ThumbnailServiceWrapper|_ThreadID=90;_ThreadName=http-thread-pool::jk-connector(4);_TimeMillis=1718073985693;_LevelValue=900;|
getDatasetCardImageAsUrl(): Failed to initialize dataset StorageIO for s3://10.5072/FK2/KHOWY6 (getDataAccessObject: Unsupported storage method.)|#]
logs/server.log
jamie jamison said:
Sorry, I just realized I posted in the wrong place. I'm limping through 5.14 to 6.2
No worries, I moved the messages.
I'm not sure what some of the domain.xml settings should be so I'm working it out via trial and error.
documentation question (https://guides.dataverse.org/en/latest/installation/config.html#second-configure-your-dataverse-installation-to-use-s3-storage):
example: dataverse.files.storage-driver-id <id> Enable <id> as the default storage driver. file (<- default)
Since all my storage is in s3 buckets, should this be: -Ddataverse.files.storage-driver-id=s3
As long as your id is "s3", yes. If your id is "foobar" it should be "foobar".
I'm trying to track down storage errors:
[#|2024-06-11T02:46:25.683+0000|WARNING|Payara 6.2023.8|edu.harvard.iq.dataverse.ThumbnailServiceWrapper|_ThreadID=90;_ThreadName=http-thread-pool::jk-connector(4);_TimeMillis=1718073985683;_LevelValue=900;|
getDatasetCardImageAsUrl(): Failed to initialize dataset StorageIO for s3://10.5072/FK2/QF48PD (getDataAccessObject: Unsupported storage method.)|#]
[#|2024-06-11T02:46:25.688+0000|WARNING|Payara 6.2023.8|edu.harvard.iq.dataverse.dataaccess.DataAccess|_ThreadID=90;_ThreadName=http-thread-pool::jk-connector(4);_TimeMillis=1718073985688;_LevelValue=900;|
Could not find storage driver for: s3|#]
@Philip Durbin @jamie jamison Dataverse doesn't need listBucket but the CLI does, if you want to pull a listing.
Perhaps inaccurately, I was using aws cli to check if the buckets were accessible.
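As a concrete check, something like this separates listing from object access (bucket name taken from this thread; listing needs s3:ListBucket on the bucket itself, object reads need s3:GetObject):
```
aws s3 ls s3://dataverse-test-oregon                        # fails without s3:ListBucket
aws s3api get-bucket-policy --bucket dataverse-test-oregon  # inspect the current policy
```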
Right now I'm stuck at the point where payara6 is running but dataverse pages aren't loading.
I managed to break the test system to the point it no longer loads. Since I thought it was idempotent I tried running the ansible script again. I stopped and disabled payara6, ran the script and got this message:
TASK [dataverse : fire off installer] **********************
fatal: [localhost]: FAILED! => {"changed": true, "cmd": "/usr/bin/python3 /tmp/dvinstall/install.py -f --config_file=default.config --noninteractive > /tmp/dvinstall/install.out 2>&1", "delta": "0:01:26.506903", "end": "2024-06-13 22:32:09.208629", "msg": "non-zero return code", "rc": 1, "start": "2024-06-13 22:30:42.701726", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}
So I realize now it's not idempotent. I can rebuild from scratch if necessary; I've had some practice. But I'm wondering if dropping the dataverse database would help?
To rerun the ansible script: 1) stop and disable payara6, 2) drop (and probably dump first) the dvndb, 3) it probably doesn't hurt to delete the domain.xml file, and then rerun the ansible script.
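As shell commands, that sequence might look like the sketch below; the payara path and database name are the stock ones from this thread and may differ per install:
```
sudo systemctl stop payara6 && sudo systemctl disable payara6
sudo -u postgres pg_dump dvndb > /tmp/dvndb_backup.sql   # dump first, just in case
sudo -u postgres dropdb dvndb
sudo rm /usr/local/payara6/glassfish/domains/domain1/config/domain.xml
ansible-playbook --connection=local -i dataverse/inventory dataverse/dataverse.pb
```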
Go go go!
Well that was fun while it lasted. Now I can't start or restart payara6
Anyone out there with postgres experience?
I was able to reload the dvndb database but I have to set (or reset) the privileges. I'm trying to figure out what '=Tc/postgres postgres=CTc/postgres dvnuser=CTc/postgres' means so I can add that to dvndb.
=Tc/postgres? Where are you seeing that?
I log into postgres and list the databases
dvndb | postgres | UTF8 | C.UTF-8 | C.UTF-8 |
dvndb_bkup | postgres | UTF8 | C.UTF-8 | C.UTF-8 | =Tc/postgres +
| | | | | postgres=CTc/postgres+
| | | | | dvnuser=CTc/postgres
dvndp | postgres | UTF8 | C.UTF-8 | C.UTF-8 |
postgres | postgres | UTF8 | C.UTF-8 | C.UTF-8 |
template0 | postgres | UTF8 | C.UTF-8 | C.UTF-8 | =c/postgres +
| | | | | postgres=CTc/postgres
template1 | postgres | UTF8 | C.UTF-8 | C.UTF-8 | =c/postgres +
| | | | | postgres=CTc/postgres
(6 rows)
And yes, dvndp is a misspelling
Just found Create, connect and temporary (stackexchange)
After running ansible it seems you have to drop or rename that initial dvndb, make a new dvndb, reload it, and then add privileges. (By the way, did I mention that I lost my sysadmin support a week before RHEL7 end-of-life, hence the frantic googling to rebuild on a new system.)
I see, like this:
```
dvndb      | postgres | UTF8 | C.UTF-8 | C.UTF-8 |
dvndb_bkup | postgres | UTF8 | C.UTF-8 | C.UTF-8 | =Tc/postgres          +
           |          |      |         |         | postgres=CTc/postgres +
           |          |      |         |         | dvnuser=CTc/postgres
dvndp      | postgres | UTF8 | C.UTF-8 | C.UTF-8 |
postgres   | postgres | UTF8 | C.UTF-8 | C.UTF-8 |
template0  | postgres | UTF8 | C.UTF-8 | C.UTF-8 | =c/postgres           +
           |          |      |         |         | postgres=CTc/postgres
template1  | postgres | UTF8 | C.UTF-8 | C.UTF-8 | =c/postgres           +
           |          |      |         |         | postgres=CTc/postgres
(6 rows)
```
I'm using fenced code blocks above. Very useful here in Zulip and in GitHub. You could try this if you want:
```
test
```
But yeah, I'm not familiar with =Tc/postgres.
Check this out: https://www.postgresql.org/docs/current/ddl-priv.html#PRIVILEGE-ABBREVS-TABLE
and https://dba.stackexchange.com/questions/264441/is-my-database-secure-what-does-tc-postgres-allow/264445#264445 seems to say the same thing, that "T" is for TEMPORARY and "c" is for CONNECT.
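So recreating that ACL on the new dvndb would be something like this sketch (dvnuser is the role from the listing above):
```
# PUBLIC gets TEMPORARY + CONNECT (the "=Tc/postgres" entry)
sudo -u postgres psql -c "GRANT TEMPORARY, CONNECT ON DATABASE dvndb TO PUBLIC;"
# dvnuser gets CREATE + TEMPORARY + CONNECT ("dvnuser=CTc/postgres")
sudo -u postgres psql -c "GRANT CREATE, TEMPORARY, CONNECT ON DATABASE dvndb TO dvnuser;"
```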
@jamie jamison IIRC postgres ownership and privileges may need to be modified table-by-table
@jamie jamison if you dump and import as Phil suggests you can use the -O (no "owner") flag which may greatly simplify the process?
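A sketch of that dump-and-restore with -O (--no-owner), using the names from this thread:
```
pg_dump -O -U postgres dvndb > dvndb.sql   # no ownership statements in the dump
createdb -U postgres -O dvnuser dvndb      # target db owned by the app role
psql -U dvnuser -d dvndb -f dvndb.sql      # objects come out owned by dvnuser
```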
That makes rebuilding the database a bit more daunting. I'll go back and try that - dump from the previous test system and bring it over to the new one.
Dropped the database and reloaded from a dump. It now matches what the old server's postgres looked like, but I'm still not getting the dataverse front page. It may just take waiting longer.
Anything at the equivalent of https://demo.dataverse.org/api/info/version ?
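On the server itself the equivalent check is a one-liner, assuming the app answers on the stock port 8080:
```
curl http://localhost:8080/api/info/version
# a healthy instance answers something like: {"status":"OK","data":{"version":"6.2","build":""}}
```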
I'll try that as soon as I can get payara6 to start again
404 page not found, requested service not available
any errors in server.log?
But I do get the payara page
I'm digging in the server log now
[#|2024-06-14T00:19:37.128+0000|SEVERE|Payara 6.2023.8|edu.harvard.iq.dataverse.mydata.DataRetrieverAPI|_ThreadID=90;_ThreadName=http-thread-pool::jk-connector(5);_TimeMillis=1718324377128;_LevelValue=1000;|
Sorry, nothing was found for these roles: Admin, File Downloader, Dataverse + Dataset Creator, Dataverse Creator, Dataset Creator, Contributor, Curator, Member|#]
Looks like the database is messed up somehow
Huh. Yeah, sounds like it. :disappointed:
There has to be a way to reload or restore a database
I mean, a long time ago I used scripts like this to dump and restore:
10 years ago according to git. But something like that should work, I would think.
that looks like what I did (will paste the commands in)
It looks like @Deirdre Kirmis has experience with database backup and recovery: https://groups.google.com/g/dataverse-community/c/kTEUkxHB_ZM/m/T6aRZ9SpCwAJ
Ok, this is probably not an official way to fix things but after reloading the database I reran ansible and now the test looks ok.
At that point I did as @Philip Durbin suggested and ran with "/api/info/version" and got status 'ok' and version '6.2'
Is it possible that rerunning the ansible script fixed the missing roles?
I think I'm following the directions exactly from https://guides.dataverse.org/en/latest/installation/config.html#second-configure-your-dataverse-installation-to-use-s3-storage
Example:
<jvm-options>-Ddataverse.files.storage-driver-id=file</jvm-options>
<jvm-options>-Xmx3841m</jvm-options>
<jvm-options>-Ddataverse.files.dataverse-test-oregon.type=s3</jvm-options>
<jvm-options>-Ddataverse.files.dataverse-test-oregon.label=dataverse-test-oregon</jvm-options>
<jvm-options>-Ddataverse.files.dataverse-test-oregon.type=s3</jvm-options>
<jvm-options>-Ddataverse.files.dataverse-test-oregon=dataverse-test-oregon</jvm-options>
<jvm-options>-Ddataverse.files.dataverse-test-oregon.download-redirect=true</jvm-options>
<jvm-options>-Ddataverse.files.dataverse-test-oregon.upload-redirect=true</jvm-options>
If I restart and check the status of payara6 I get:
sudo systemctl status payara6
× payara6.service - Payara Server
Loaded: loaded (/etc/systemd/system/payara6.service; enabled; preset: disabled)
Active: failed (Result: exit-code) since Mon 2024-06-17 17:07:10 UTC; 8min ago
Process: 43806 ExecStart=/usr/bin/java -jar /usr/local/payara6/glassfish/lib/client/appserver-cli.jar start-domain (code=exited, status=1/FAILURE)
CPU: 2.396s
Jun 17 17:07:08 ip-172-31-23-165.us-west-2.compute.internal systemd[1]: Starting Payara Server...
Jun 17 17:07:10 ip-172-31-23-165.us-west-2.compute.internal java[43806]: Port 7676 is in use
Jun 17 17:07:10 ip-172-31-23-165.us-west-2.compute.internal java[43806]: Command start-domain failed.
Jun 17 17:07:10 ip-172-31-23-165.us-west-2.compute.internal systemd[1]: payara6.service: Control process exited, code=exited, status=1/FAILURE
Jun 17 17:07:10 ip-172-31-23-165.us-west-2.compute.internal systemd[1]: payara6.service: Failed with result 'exit-code'.
Jun 17 17:07:10 ip-172-31-23-165.us-west-2.compute.internal systemd[1]: Failed to start Payara Server.
Jun 17 17:07:10 ip-172-31-23-165.us-west-2.compute.internal systemd[1]: payara6.service: Consumed 2.396s CPU time.
[rocky@ip-172-31-23-165 ~]$
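The key line is "Port 7676 is in use" - 7676 is Payara's JMS broker port, so a leftover java process from an earlier start attempt is the usual suspect. A sketch for tracking it down:
```
sudo ss -ltnp | grep 7676    # show which PID is holding the port
sudo kill <PID>              # or: sudo pkill -f payara6 to sweep stale domains
sudo systemctl start payara6
```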
And, oddly, the test system looks like it's up, though with no buckets, until I reboot the system; then I get a 500.
And another question. When it's up I'm not able to log into the test dataverse with the default admin account. In the ansible script it looks like the admin account is 'dataverseAdmin' and the password is "admin1", which gets the error "username, email address, or password you entered is invalid."
And for shibboleth, following https://guides.dataverse.org/en/latest/installation/shibboleth.html#install-shibboleth-via-yum
I'm stuck at the error: Error: GPG check FAILED
is there anything more descriptive in the Payara server.log?
on Shibboleth: worst case you can set gpgcheck=0 in the repo config, but better to import the GPG key
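Importing the keys referenced by the shibboleth.repo file below would be:
```
sudo rpm --import https://shibboleth.net/downloads/service-provider/RPMS/repomd.xml.key
sudo rpm --import https://shibboleth.net/downloads/service-provider/RPMS/cantor.repomd.xml.key
```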
For the payara server log, part of the problem is I'm not sure what to look for. I'm partially relying on the time stamp.
For shib I'm obviously not importing the GPG key correctly. I have back-to-back patron consults till noon and then will dig back in.
my shibboleth.repo is looking at
gpgkey=https://shibboleth.net/downloads/service-provider/RPMS/repomd.xml.key
https://shibboleth.net/downloads/service-provider/RPMS/cantor.repomd.xml.key
That is the same thing that I have:
```
[shibboleth]
name=Shibboleth (rockylinux9)
type=rpm-md
mirrorlist=https://shibboleth.net/cgi-bin/mirrorlist.cgi/rockylinux9
gpgcheck=1
gpgkey=https://shibboleth.net/downloads/service-provider/RPMS/repomd.xml.key
https://shibboleth.net/downloads/service-provider/RPMS/cantor.repomd.xml.key
enabled=1
```
One thing I am running into is that out of the box I can't log into Dataverse. I've tried to unblock admin in the main.yml script but that didn't help.
Is there a way via the API to change the dataverseAdmin password? I'm reading through the documentation but haven't found that yet.
@jamie jamison if you're using Dataverse-Ansible you can set it as a group_var?
Ok, I'll look into that. I did try setting the dataverseAdmin password in the ansible script.
Other thing: even before I try to add s3 buckets I get this:
exception
jakarta.servlet.ServletException: /dataset.xhtml @595,164 rendered="#{settingsWrapper.makeDataCountDisplayEnabled and DatasetPage.doi}": The class 'edu.harvard.iq.dataverse.DatasetPage' does not have the property 'doi'.
root cause
jakarta.el.PropertyNotFoundException: /dataset.xhtml @595,164 rendered="#{settingsWrapper.makeDataCountDisplayEnabled and DatasetPage.doi}": The class 'edu.harvard.iq.dataverse.DatasetPage' does not have the property 'doi'.
I'm wondering where makeDataCount is enabled - trying to find it in main.yml
@jamie jamison I don't think Ansible knows about MDC yet?
Ok, mostly I'm trying to understand the error messages - a lot of this is new to me so I'm still limping up the learning curve
Since I'm using a reloaded dvndb database (from the RHEL7 test dataverse) I'm wondering if it's possible that MDC is in the database and causing that error.
And I may have found something in the 6.2 documentation - https://guides.dataverse.org/en/latest/admin/make-data-count.html#enable-or-disable-display-of-make-data-count-metrics
And that seems to solve at least that error. I'm putting that in my notes - things to consider when restoring a previous database
Actually the most exasperating issue at this point is that the default, out-of-the-box dataverseAdmin password doesn't work, though this might also be because of the reloaded dvndb database. Is that where passwords are stored?
Also, sometimes payara6 seems 'stuck': unable to stop or restart or even kill the process.
payara6.service - Payara Server
Loaded: loaded (/etc/systemd/system/payara6.service; enabled; preset: disabled)
Active: failed (Result: exit-code) since Wed 2024-06-26 17:43:43 UTC; 39min ago
Process: 720 ExecStart=/usr/bin/java -jar /usr/local/payara6/glassfish/lib/client/appserver-cli.jar start-domain (code=exited, st>
CPU: 1.304s
Jun 26 17:43:38 ip-172-31-23-165.us-west-2.compute.internal systemd[1]: Starting Payara Server...
Jun 26 17:43:43 ip-172-31-23-165.us-west-2.compute.internal java[720]: com.sun.enterprise.universal.xml.MiniXmlParserException: "Xml >
Jun 26 17:43:43 ip-172-31-23-165.us-west-2.compute.internal java[720]: Message: elementGetText() function expects text only elment bu>
Jun 26 17:43:43 ip-172-31-23-165.us-west-2.compute.internal java[720]: Command start-domain failed.
Jun 26 17:43:43 ip-172-31-23-165.us-west-2.compute.internal systemd[1]: payara6.service: Control process exited, code=exited, status=>
Jun 26 17:43:43 ip-172-31-23-165.us-west-2.compute.internal systemd[1]: payara6.service: Failed with result 'exit-code'.
Jun 26 17:43:43 ip-172-31-23-165.us-west-2.compute.internal systemd[1]: Failed to start Payara Server.
Jun 26 17:43:43 ip-172-31-23-165.us-west-2.compute.internal systemd[1]: payara6.service: Consumed 1.304s CPU time.
[rocky@ip-172-31-23-165 logs]$
@jamie jamison I've seen Payara do this on beta.dataverse.org - is there more information in server.log? on dataverseAdmin - since you have access to the DB, you could update the e-mail address associated with userid '1' in the authenticateuser table and trigger a password reset, or you could just blank the password entry?
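A sketch of that first route; the table and column names here assume a stock Dataverse schema, so verify against your own database first:
```
# point userid 1 at an address you control, then use "Forgot your password?"
# in the UI to trigger a reset mail (table name assumed: authenticateduser)
sudo -u postgres psql dvndb -c \
  "UPDATE authenticateduser SET email = 'you@example.edu' WHERE id = 1;"
```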
on restoring databases / moving servers: I keep a shell script around on test machines, reset_settings.sh or some such, with a list of commonly-required curl commands to make the test server a test server when I've imported a copy of the production database
And the problem turned out to be a typo by me. Now back on track.
The current s3 bucket related error is:
6.2023.8|edu.harvard.iq.dataverse.dataaccess.DataAccess|_ThreadID=88;_ThreadName=http-thread-pool::jk-connector(2);_TimeMillis=1719511203190;_LevelValue=900;|
Could not find storage driver for: s3|#]
Lastly, I can't seem to wget the reset_settings.sh file. Is there another place to get it?
About the storage driver issue, here is part of the domain.xml code:
<jvm-options>-Ddataverse.files.storage-driver-id=s3</jvm-options> <- is this correct, the default is 'file' but should it be changed to 's3' for the s3 buckets?
<jvm-options>-Xmx3841m</jvm-options>
<jvm-options>-Ddataverse.files.dataverse-test-oregon.type=s3</jvm-options>
<jvm-options>-Ddataverse.files.dataverse-test-oregon.label=dataverse-test-oregon</jvm-options>
<jvm-options>-Ddataverse.files.dataverse-test-oregon.bucket-name=dataverse-test-oregon</jvm-options>
<jvm-options>-Ddataverse.files.dataverse-test-oregon.download-redirect=true</jvm-options>
<jvm-options>-Ddataverse.files.dataverse-test-oregon.upload-redirect=true</jvm-options>
@jamie jamison the script was just something I threw together to make a copy of my production database suitable for our test server. I'm happy to share mine with you (just some API calls to change settings). if you're all S3, then yes you want storage-driver-id=s3. it may be because your label is dataverse-test-oregon instead of s3?
This is what I'm a bit confused about. According to the documentation:
dataverse.files.<id>.label <?> Required label to be shown in the UI for this storage. default: (none)
So if the label is what's shown in the UI, wouldn't the label be 'dataverse-test-oregon'?
Did you figure it out, what's shown in the UI? (I'm not sure, myself.)
we're still working on it. A fresh install with a reloaded database is messier than a plain fresh install. You guys will get the notes on how we did it when we're done.
Great, much appreciated. Good luck!
@jamie jamison the label appears to be a friendly name; the <id> value identifies the datastore. does that make sense?
In my case probably not. I understand the label. For the test I'm using the bucket name as the friendly name. For the <id>, I guess that's where I'm confused. If the default is 'file' for the local file system, what is appropriate for s3 storage?
Documentation:
dataverse.files.storage-driver-id <id> Enable <id> as the default storage driver. file
@jamie jamison the <id> bit threw me as well; ours is simply "s3" and suits us for now. I'll need to get more descriptive as we add datastores.
Ok. Right now dataverse 6.2 is dead in the water because it can't access the s3 buckets, and 5.14 is giving me grief since the operating system is end-of-life.
@jamie jamison remind me of the error you get accessing the buckets from 6.2?
jamie jamison: And the problem turned out to be a typo by me. Now back on track.
The current s3 bucket related error is:
6.2023.8|edu.harvard.iq.dataverse.dataaccess.DataAccess|_ThreadID=88;_ThreadName=http-thread-pool::jk-connector(2);_TimeMillis=1719511203190;_LevelValue=900;|
Could not find storage driver for: s3|#]
Lastly, I can't seem to wget the reset_settings.sh file. Is there another place to get it?
jamie jamison:ย About the storage driver issue, here is part of the domain.xml code:
<jvm-options>-Ddataverse.files.storage-driver-id=s3</jvm-options> <- is this correct, the default is 'file' but should it be changed to 's3' for the s3 buckets?
<jvm-options>-Xmx3841m</jvm-options>
<jvm-options>-Ddataverse.files.dataverse-test-oregon.type=s3</jvm-options>
<jvm-options>-Ddataverse.files.dataverse-test-oregon.label=dataverse-test-oregon</jvm-options>
<jvm-options>-Ddataverse.files.dataverse-test-oregon.bucket-name=dataverse-test-oregon</jvm-options>
<jvm-options>-Ddataverse.files.dataverse-test-oregon.download-redirect=true</jvm-options>
<jvm-options>-Ddataverse.files.dataverse-test-oregon.upload-redirect=true</jvm-options>
I'm glad fixing the typo is helping!
Sorry, a reset settings script? I'm not familiar with it. You're seeing this in the guides or a release note somewhere?
reset settings script was a suggestion from Don. Not in the guides or release notes.
So far still can't access the s3 buckets and unable to get letsencrypt working.
Oh, I understand now, thanks
@jamie jamison in this case your storage-driver-id for s3 is dataverse-test-oregon
the reset settings script is just a handful of curl commands I threw together in a script to set the Authority of our test server instead of our production, use the throw-away shoulder, and put up a banner warning that the test server is just a test server. it's really a way to automate the stuff I had to manually fix each time I imported a copy of our production database.
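The flavor of such a script, as a sketch - these are stock Dataverse admin API endpoints, but the values here are placeholders and the exact setting names are worth double-checking against the guides:
```
API=http://localhost:8080/api/admin
curl -X PUT -d 10.5072 "$API/settings/:Authority"   # throw-away DOI authority
curl -X PUT -d FK2/ "$API/settings/:Shoulder"       # throw-away shoulder
curl -H "Content-type: application/json" -X POST "$API/bannerMessage" \
  -d '{"messageTexts":[{"lang":"en","message":"This is a TEST server."}]}'
```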
I guess I didn't completely understand the documentation. Right now test is down while we try to get letsencrypt working. But as soon as it's back I'll try that. The clarification is helpful.
This is aside from the installation problem. Since Dataverse 5.14 on RedHat7 is barely working or staying up, I've found I can still publish some of the user datasets with the API - so they can get DOIs for publishing. The API works great.
Is anyone else having issues with 'stuck' payara6? The service can't be stopped, restarted or killed.
sudo systemctl status payara6
× payara6.service - Payara Server
Loaded: loaded (/etc/systemd/system/payara6.service; enabled; preset: disabled)
Active: failed (Result: timeout) since Tue 2024-07-09 17:49:42 UTC; 2 days ago
CPU: 28.414s
Jul 09 17:47:41 ip-172-31-23-165.us-west-2.compute.internal systemd[1]: Starting Payara Server...
Jul 09 17:49:41 ip-172-31-23-165.us-west-2.compute.internal systemd[1]: payara6.service: start operation timed out. Terminating.
Jul 09 17:49:42 ip-172-31-23-165.us-west-2.compute.internal java[728]: Waiting for domain1 to start .............................>
Jul 09 17:49:42 ip-172-31-23-165.us-west-2.compute.internal systemd[1]: payara6.service: Control process exited, code=exited, sta>
Jul 09 17:49:42 ip-172-31-23-165.us-west-2.compute.internal systemd[1]: payara6.service: Failed with result 'timeout'.
Jul 09 17:49:42 ip-172-31-23-165.us-west-2.compute.internal systemd[1]: Failed to start Payara Server.
Jul 09 17:49:42 ip-172-31-23-165.us-west-2.compute.internal systemd[1]: payara6.service: Consumed 28.414s CPU time.
lines 1-12/12 (END)
Don Sizemore said:
jamie jamison in this case your storage-driver-id for s3 is dataverse-test-oregon
I think I need one more clarification. If there are 4 s3 buckets - bucket1, bucket2, bucket3, bucket4 - does each one have to have its own storage-driver-id?
dataverse.files.bucket1-storage-driver-id=bucket1
dataverse.files.bucket2-storage-driver-id=bucket2
dataverse.files.bucket3-storage-driver-id=bucket3
dataverse.files.bucket4-storage-driver-id=bucket4
These options are all mixed up. The <id> part needs to be a unique identifier for your storage; the storage-driver-id option (mind the ".") references the driver to use for storage <id>.
Here's a complete example of an S3 storage from our beta env:
dataverse.files.s3.bucket-name=juelich_data_beta
dataverse.files.s3.custom-endpoint-url=https://s3.fz-juelich.de
dataverse.files.s3.label=Jülich-DATA-Object-Store
dataverse.files.s3.path-style-access=true
dataverse.files.s3.type=s3
dataverse.files.storage-driver-id=s3
So in your case, this needs to be (assuming all of them are S3):
dataverse.files.bucket1.storage-driver-id=s3
dataverse.files.bucket1.s3.xxx
...
dataverse.files.bucket2.storage-driver-id=s3
dataverse.files.bucket2.s3.xxx
...
dataverse.files.bucket3.storage-driver-id=s3
dataverse.files.bucket3.s3.xxx
...
dataverse.files.bucket4.storage-driver-id=s3
dataverse.files.bucket4.s3.xxx
...
The docs say "dataverse.files.storage-driver-id" allows you to "Enable <id> as the default storage driver." -- https://guides.dataverse.org/en/6.3/installation/config.html#list-of-s3-storage-options
So you'll want to pick one of your four buckets as the <id>, like this:
dataverse.files.storage-driver-id=bucket3
Dang you're right! I mixed that up myself! See, it's fricking complicated!
One day I will make the MPCONFIG part for the storage system and take care to rename the default driver option!
So friggin' complicated. Need more docs, I guess, short term.
I'll make an example along with the notes I send you. Last question to clarify: you only need 1 default driver id? Even with 2 or more s3 buckets?
Yes. That's the driver that will be assigned to a new collection if you don't specify a different one.
There's only 1 instance wide default :-)
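Pulling the corrected pattern together, a two-store sketch in the same domain.xml form as the thread's snippets (the ids bucket1/bucket2, labels, and bucket names are placeholders, not a tested config):
```
<jvm-options>-Ddataverse.files.bucket1.type=s3</jvm-options>
<jvm-options>-Ddataverse.files.bucket1.label=bucket1</jvm-options>
<jvm-options>-Ddataverse.files.bucket1.bucket-name=bucket1</jvm-options>
<jvm-options>-Ddataverse.files.bucket2.type=s3</jvm-options>
<jvm-options>-Ddataverse.files.bucket2.label=bucket2</jvm-options>
<jvm-options>-Ddataverse.files.bucket2.bucket-name=bucket2</jvm-options>
<!-- exactly one instance-wide default store: -->
<jvm-options>-Ddataverse.files.storage-driver-id=bucket1</jvm-options>
```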
Thank you for clarifying. I'll send over my example as soon as I get finished setting up the new server.
Here is some code following the example:
<jvm-options>-Ddataverse.files.datavers-test-oregon.storage-driver-id=s3</jvm-options>
<jvm-options>-Ddataverse.files.dataverese-test-oregon.type=s3</jvm-options>
<jvm-options>-Ddataverse.files.dataverse-test-oregon=dataverse-test-oregon</jvm-options>
<jvm-options>-Ddataverse.files.dataverse-test-oregon.download-redirect=true</jvm-options>
<jvm-options>-Ddataverse.files.dataverse-test-oregon.upload-redirect=true</jvm-options>
<jvm-options>-Ddataverse.files.ssda-files.storage-driver-id=s3</jvm-options>
<jvm-options>-Ddataverse.files.ssda-files.type=s3</jvm-options>
<jvm-options>-Ddataverse.files.ssda-files=ssda-files</jvm-options>
<jvm-options>-Ddataverse.files.ssda-files.download-redirect=true</jvm-options>
<jvm-options>-Ddataverse.files.ssda-files.upload-redirect=true</jvm-options>
and I still get:
[#|2024-07-12T20:14:08.003+0000|WARNING|Payara 6.2023.8|edu.harvard.iq.dataverse.dataaccess.DataAccess|_ThreadID=88;_ThreadName=http-thread-pool::jk-connector(3);_TimeMillis=1720815248003;_LevelValue=900;|
Could not find storage driver for: s3|#]
[#|2024-07-12T20:14:08.003+0000|WARNING|Payara 6.2023.8|edu.harvard.iq.dataverse.ThumbnailServiceWrapper|_ThreadID=88;_ThreadName=http-thread-pool::jk-connector(3);_TimeMillis=1720815248003;_LevelValue=900;|
getDatasetCardImageAsUrl(): Failed to initialize dataset StorageIO for s3://10.5072/FK2/MN0MUZ (getDataAccessObject: Unsupported storage method.)|#]
[#|2024-07-12T20:14:08.008+0000|WARNING|Payara 6.2023.8|edu.harvard.iq.dataverse.dataaccess.DataAccess|_ThreadID=88;_ThreadName=http-thread-pool::jk-connector(3);_TimeMillis=1720815248008;_LevelValue=900;|
Could not find storage driver for: s3|#]
[#|2024-07-12T20:14:08.009+0000|WARNING|Payara 6.2023.8|edu.harvard.iq.dataverse.ThumbnailServiceWrapper|_ThreadID=88;_ThreadName=http-thread-pool::jk-connector(3);_TimeMillis=1720815248009;_LevelValue=900;|
getDatasetCardImageAsUrl(): Failed to initialize dataset StorageIO for s3://10.5072/FK2/KHOWY6 (getDataAccessObject: Unsupported storage method.)|#]
So maybe the problem is something else? I did check the aws configuration and that looks correct.
dataverse.files.datavers-test-oregon.storage-driver-id=s3 has a typo in it. An "e" is missing from the end of "dataverse".
well that's embarrassing for an old secretary... Will go retry now
Personally, I would use an <id> that doesn't have hyphens in it. But maybe they do work. Dunno.
I fixed the typo. Still get the error. The reason I formatted it the way that I did was an example from Oliver Bertuch (above):
dataverse.files.bucket4.storage-driver-id=s3
and this is what's in documentation:
dataverse.files.storage-driver-id <id> Enable <id> as the default storage driver. file
I feel like there's something amiss with your config but it's late here. Maybe I can look at it with fresh eyes next week.
There's also the missing .label thing
Educated guess on the error messages: the storage driver ID is IIRC built into the location URL. So there is no storage driver called "s3" around as per the config you posted
Yupp. These storage ids need to be migrated... (DataAccess.getStorageIdFromLocation)
Or you configure a storage driver that has an id "s3" :wink:
Could you post a sample of what that looks like? I've been using this example:
dataverse.files.bucket2.storage-driver-id=s3
Related: https://github.com/IQSS/dataverse/issues/10684
@jamie jamison this one doesn't look right, for example:
<jvm-options>-Ddataverse.files.ssda-files=ssda-files</jvm-options>
ssda-files is your <id> and there should be something after it like .type or .label or whatever.
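For reference, Oliver's alternative - a store whose <id> is literally "s3", matching the id already baked into the stored s3:// locations - might look like this sketch (bucket name reused from this thread):
```
<jvm-options>-Ddataverse.files.s3.type=s3</jvm-options>
<jvm-options>-Ddataverse.files.s3.label=s3</jvm-options>
<jvm-options>-Ddataverse.files.s3.bucket-name=dataverse-test-oregon</jvm-options>
<jvm-options>-Ddataverse.files.storage-driver-id=s3</jvm-options>
```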
Another non-s3 question. We're still having trouble with the install. I was asked what branch we should be using if we want to install dataverse 6.2 on rocky 9. Ongoing issues with letsencrypt.
Should we use 243_rocky_9 or 6.2?
Thanks @jamie, if we would like to use the community ansible scripts for dvn 6.2 on rocky 9, what is the suggested branch to use?
https://github.com/gdcc/dataverse-ansible/tree/v6.2 or
https://github.com/gdcc/dataverse-ansible/tree/243_rocky_9
@Kristian Allen - UCLA is this a new installation? Typically that's what the Ansible scripts are used for.
Yes, we are doing a fresh install because we were unable to update RHEL7 in-place. So we are doing a fresh install of 6.2 on Rocky 9
Right, right. I don't want to tell you the wrong thing. Better wait for @Don Sizemore
yes @jamie jamison is right -- clean install. We've been using the v6.2 branch but have been having a few issues; we'll submit some PRs once things settle down and we get things working.
The current blocker is multiple processes grabbing the same port:
Following the readme, we updated the main.yml
apache:
enabled: true
public_fqdn:
ssl:
enabled: false
remote_cert: false
port: 443
cert:
interm:
key:
pem:
cert:
key:
interm:
port: 80
But is 443 or 80 redefined anywhere, like in dataverse-apache.yml?
- name: certbot bonks on listen 443
lineinfile:
path: '{{ apache_virtualhost_dir }}/http.proxy.conf'
regexp: '^Listen 443 https'
state: absent
when: letsencrypt.enabled == true
@Kristian Allen - UCLA the Apache proxy bits of Dataverse-Ansible are very, very untested. pull requests to fix problems or specific issue descriptions are most welcome.
Well, we're definitely testing them at ucla
@Don Sizemore We're wondering if we are using the correct branch - 243_rocky_9 or 6.2. So I don't make a pull request for the wrong branch.
@jamie jamison I could attempt to merge the 6.2 branch with Rocky9, if you like? I'm happy to schedule a meeting. The httpd portion of Dataverse-Ansible is entirely untested. LetsEncrypt refuses to issue certs for addresses ending in amazon.com, so that code is theoretical.
I'm going to wait for Tim and Kristian but wondering if I should just do the fresh install with 6.0 or 6.1 and then do conventional updates?
How are most people doing a fresh install?
@Don Sizemore do you think it might help @jamie jamison and @Kristian Allen - UCLA to use the way I'm upgrading, with migrations and all?
Since the ansible script for 6.2 doesn't seem to be working I'm happy to hear suggestions
I can't upgrade from 5.14 because RHEL7 hit end-of-life. Have to do a fresh install (on rocky9)
It's the letsencrypt part that fails. We're going to try yet again and install letsencrypt after the fact. But I'd still like to know how other people are doing fresh installs.
Successfully, that is.
Have you tried disabling the Let's Encrypt part from the DV Ansible? And then using another role that has been tested working on RHEL 9 or doing it manually?
Just to get stuff going...
In the 6.2 ansible script, letsencrypt out-of-the-box is disabled. We're going to test again on a fresh Rocky 9 and see how that goes.
@Don Sizemore Would you have any time to meet on a friday?
@Oliver Bertuch Dataverse-Ansible works with RHEL9, but IIRC that (finally) got merged after the 6.2 branch
@jamie jamison I think I should be able to meet late Friday morning or in the afternoon? You'll appreciate that a test server here started inexplicably receiving 403 Forbiddens from AWS S3. No config changes.
@jamie jamison on your 403 Forbiddens: did you update your CORS rules per the CORS section of https://github.com/IQSS/dataverse/releases/v5.12 ?
We're still back at getting letsencrypt working but I believe I did since we were at 5.14
We're in a different time zone but morning or afternoon is fine. Are you on EST?
12:30 or 1pm our time? I'll figure out pst to your time
we're on EDT but I'm getting 403 Forbiddens on a test server, so I'm interested in comparing notes there.
What's a good time for you?
@jamie jamison I have a 9:45 EDT with Sonia, but other than that I'm free. On the 403 forbiddens: how are you defining your S3 credentials?
Dealing with the letsencrypt and certbot somewhat distracted me from the s3 buckets. I'm getting back to that today
I should add that I need a working system to deal with the s3 buckets and at this time I don't have that. I'm wondering how other people do their installations or migration installations: the ansible script or the manual install?
I have a working test dataverse and some working s3 buckets. I'm using @Oliver Bertuch's example:
dataverse.files.bucket1.storage-driver-id=s3
dataverse.files.bucket1.s3.xxx
...
dataverse.files.bucket2.storage-driver-id=s3
dataverse.files.bucket2.s3.xxx
...
dataverse.files.bucket3.storage-driver-id=s3
dataverse.files.bucket3.s3.xxx
...
dataverse.files.bucket4.storage-driver-id=s3
dataverse.files.bucket4.s3.xxx
So each bucket needs to have its own storage id.
It still looks a little off to me but if it's working, great!
The example above is from @Oliver Bertuch. I guess what I had been trying to verify is whether every bucket needs its own storage driver if they are all the same storage type.
Our current theory is to do a fresh install of 6.3 via Ansible (this is working), then migrate the database from 5.14 to 6.3 via Flyway. Does this make sense or does anyone see any areas to be aware of?
The thinking was that https://github.com/IQSS/dataverse/blob/0d279573bb8d7c96d7a4a1dc4b66b2258059dfba/src/main/java/edu/harvard/iq/dataverse/flyway/StartupFlywayMigrator.java#L16 can be run on its own?
Or is there a cleaner way to do this via pulling objects out and putting back in via DVN API?
I mean, people have pulled off similar feats but officially we ask that you step through each upgrade. (We know this is painful.)
I don't think the step-through upgrade path would work in our case, as our 5.14 instance is on an older version of RHEL and we have to move to a new OS (Rocky), so we were trying to determine a path forward
I think we'll power forward with the manual migration but keep meticulous notes and then we'll make sure to write up for the community
@Kristian Allen - UCLA I personally think that your new server is a great spot to migrate/upgrade from 5.14 to 6.0, then carry on to 6.3.
@Kristian Allen - UCLA you might be interested in this "Our experience on upgrading dataverse from 4.2.2 to 6.1 in a non-standard way" thread: https://groups.google.com/g/dataverse-community/c/BUO37-reWIs/m/18SK0DZlAQAJ
I have been doing experiments on a similar migration from 4.20 to 6.3 using Flyway only as well.
You can find the necessary extra migrations you will need when using Flyway only here: https://jugit.fz-juelich.de/fdm/dev/dataverse/-/tree/juelich-data-upgrade/src/main/resources/db/extra?ref_type=heads
Obviously, you'll only need the 5.14+ migrations, not the others.
I've also added the Flyway Maven Plugin in this branch to the DV app POM. https://jugit.fz-juelich.de/fdm/dev/dataverse/-/blob/juelich-data-upgrade/pom.xml?ref_type=heads#L755
I can share more details about how I used a few local containers to get the job done if that helps.
Don Sizemore said:
Kristian Allen - UCLA I personally think that your new server is a great spot to migrate/upgrade from 5.14 to 6.0, then carry on to 6.3.
@Don Sizemore I'm trying to install a fresh test 5.14. The directions say it "may be installed using branches tagged with that version". What is the correct syntax to tag the version?
Would this look like:
```
git clone https://github.com/GlobalDataverseCommunityConsortium/dataverse-ansible.git dataverse-5.14
```
I might get this wrong, but I believe it refers to the tagging on the Dataverse side.
I think you'd change this from "version:6.3" to whatever version you want but I'm not sure if it "just works": https://github.com/gdcc/dataverse-ansible/blob/c1e21f255dd00bf5d95c2a6a807ae7579974361b/defaults/main.yml#L264
Oliver Bertuch said:
I can share more details about how I used a few local containers to get the job done if that helps.
@Oliver Bertuch Yes, some details on how this is used would help.
Ok so here's a quick list of tasks:
1. pg_dump your current database into a dump.sql file.
2. docker compose up to start a PG container exposed on port 54321, and restore the dump.sql file into it.
3. Put the extra migrations in src/main/resources/db/extra.
4. mvn flyway:info. You should see that some migrations are missing.
5. mvn flyway:migrate -Dflyway.outOfOrder
6. mvn flyway:info
7. Remove the extra folder or comment out that location in the POM - the idea is to make them unavailable again, as we don't want them around for all eternity or during the next upgrade.
8. mvn flyway:repair
9. mvn flyway:info again should no longer list our two extra migrations!
10. docker exec -it data-next-postgres pg_dump -U dataverse > dump-v6.3.sql
11. docker compose down

@jamie jamison you want to clone the 5.14 branch:
$ git clone -b 5.14 https://github.com/gdcc/dataverse-ansible.git 5.14
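A side note on step 2 of Oliver's list: a minimal compose file matching the container name and port there might look like the sketch below (the image tag and credentials are assumptions, not his actual file):
```
services:
  postgres:
    image: postgres:16
    container_name: data-next-postgres
    environment:
      POSTGRES_USER: dataverse     # matches the -U dataverse in step 10
      POSTGRES_PASSWORD: secret    # placeholder
      POSTGRES_DB: dataverse
    ports:
      - "54321:5432"               # exposes PG on host port 54321
```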
Don Sizemore said:
jamie jamison you want to clone the 5.14 branch:
$ git clone -b 5.14 https://github.com/gdcc/dataverse-ansible.git 5.14
Tried that and ran ansible:
```
ansible-playbook --connection=local -vvv -i 5.14/inventory 5.14/dataverse.pb -e "@5.14/defaults/main.yml"
```
Unfortunately it failed at the step [dataverse : install java-nnn-openjdk and other packages for RedHat/Rocky], task path: /home/rocky/dataverse/tasks/dataverse-prereqs.yml:63
Huge, long error message. Don't know if you want to see that.
Possible java version error:
```
fatal: [localhost]: FAILED! => {
"msg": "An unhandled exception occurred while templating '{'version': 11, 'home': '/usr/lib/jvm/java-{{ java.version}}'}'. Error was a <class 'ansible.errors.AnsibleError'>, original
```
@jamie jamison sigh, that again. someone else introduced a reflexive group_var without testing the role. two quick fixes: either edit tasks/dataverse-prereqs.yml and hard-code 11, or better yet set the java.home variable in group_vars - either should get you going. Solr mirrors tend to remove old versions, so if it dies downloading Solr, you can give it a custom Solr download url in group_vars pointing to archive.apache.org.
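A guess at the group_vars override, with names inferred from the error above (untested; hard-coding home stops the template from referencing java.version recursively):
```
java:
  version: 11
  home: /usr/lib/jvm/java-11
```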
I finally was able to restore the test dataverse 5.14 from an aws snapshot. I'm going to keep the notes on restoring earlier versions. Might be helpful for other projects.