Stream: troubleshooting

Topic: ✔ Page not Found for all Datasets after update


view this post on Zulip Henning Timm (May 13 2024 at 14:53):

Hi there, some update troubles again :sweat_smile:

After updating to v6.2, I get a 404 error for all dataset pages. The page only reads "Page Not Found - The page you are looking for was not found.", the DevTools report a 404, and the Payara logs tell me "Failed to retrieve version":

[#|2024-05-13T16:26:17.001+0200|WARNUNG|Payara 6.2023.8|edu.harvard.iq.dataverse.DatasetPage|_ThreadID=94;_ThreadName=http-thread-pool::jk-connector(4);_TimeMillis=1715610377001;_LevelValue=900;|
  Failed to retrieve version|#]

There also seems to be a problem connecting to S3 that I have not seen before. From the logs:

[#|2024-05-13T16:49:32.483+0200|WARNUNG|Payara 6.2023.8|com.amazonaws.internal.InstanceMetadataServiceResourceFetcher|_ThreadID=92;_ThreadName=http-thread-pool::jk-connector(1);_TimeMillis=1715611772483;_LevelValue=900;|
  Fail to retrieve token
com.amazonaws.SdkClientException: Failed to connect to service endpoint:
        at com.amazonaws.internal.EC2ResourceFetcher.doReadResource(EC2ResourceFetcher.java:100)
        at com.amazonaws.internal.InstanceMetadataServiceResourceFetcher.getToken(InstanceMetadataServiceResourceFetcher.java:91)
        at com.amazonaws.internal.InstanceMetadataServiceResourceFetcher.readResource(InstanceMetadataServiceResourceFetcher.java:69)
        at com.amazonaws.internal.EC2ResourceFetcher.readResource(EC2ResourceFetcher.java:66)
[...]
Caused by: java.net.ConnectException: Verbindungsaufbau abgelehnt (Connection Refused)
        at java.base/sun.nio.ch.Net.pollConnect(Native Method)
        at java.base/sun.nio.ch.Net.pollConnectNow(Net.java:672)
        at java.base/sun.nio.ch.NioSocketImpl.timedFinishConnect(NioSocketImpl.java:554)
        at java.base/sun.nio.ch.NioSocketImpl.connect(NioSocketImpl.java:602)
        at java.base/java.net.Socket.connect(Socket.java:633)
        at java.base/sun.net.NetworkClient.doConnect(NetworkClient.java:178)
        at java.base/sun.net.www.http.HttpClient.openServer(HttpClient.java:533)
        at java.base/sun.net.www.http.HttpClient.openServer(HttpClient.java:638)
        at java.base/sun.net.www.http.HttpClient.<init>(HttpClient.java:281)
        at java.base/sun.net.www.http.HttpClient.New(HttpClient.java:386)
        at java.base/sun.net.www.http.HttpClient.New(HttpClient.java:408)

Has anyone seen something like this before? Nothing in our S3 settings changed, and on our staging server the update did not cause such issues.

view this post on Zulip Henning Timm (May 13 2024 at 15:18):

The issue is stranger still, and S3 is probably not at fault. I definitely have access to the S3 storage and can download files. When I navigate directly to a file's page, I can see and download the file, so there seems to be a problem with the PID settings or Solr.

view this post on Zulip Philip Durbin 🚀 (May 13 2024 at 15:27):

Failed to retrieve version. Huh.

view this post on Zulip Philip Durbin 🚀 (May 13 2024 at 15:27):

Is there more of server.log you can share?

view this post on Zulip Henning Timm (May 13 2024 at 15:39):

Nothing that really seems to be connected. I tried to rebuild the SOLR index by first detecting orphans, then clearing the index and doing a full reindex. There I got this:

[#|2024-05-13T17:19:43.062+0200|INFORMATION|Payara 6.2023.8|edu.harvard.iq.dataverse.search.IndexBatchServiceBean|_ThreadID=203;_ThreadName=__ejb-thread-pool7;_TimeMillis=1715613583062;_LevelValue=800;|
  Beginning indexStatus()|#]

[#|2024-05-13T17:19:43.063+0200|INFORMATION|Payara 6.2023.8|edu.harvard.iq.dataverse.search.IndexBatchServiceBean|_ThreadID=203;_ThreadName=__ejb-thread-pool7;_TimeMillis=1715613583063;_LevelValue=800;|
  checking for stale or missing dataverses|#]

[#|2024-05-13T17:19:43.084+0200|INFORMATION|Payara 6.2023.8|edu.harvard.iq.dataverse.search.IndexBatchServiceBean|_ThreadID=203;_ThreadName=__ejb-thread-pool7;_TimeMillis=1715613583084;_LevelValue=800;|
  checking for stale or missing datasets|#]

[#|2024-05-13T17:19:43.088+0200|INFORMATION|Payara 6.2023.8|edu.harvard.iq.dataverse.search.IndexBatchServiceBean|_ThreadID=203;_ThreadName=__ejb-thread-pool7;_TimeMillis=1715613583088;_LevelValue=800;|
  completed check for stale or missing content.|#]

[#|2024-05-13T17:19:43.088+0200|INFORMATION|Payara 6.2023.8|edu.harvard.iq.dataverse.search.IndexBatchServiceBean|_ThreadID=203;_ThreadName=__ejb-thread-pool7;_TimeMillis=1715613583088;_LevelValue=800;|
  checking for dataverses in Solr only|#]

[#|2024-05-13T17:19:43.128+0200|INFORMATION|Payara 6.2023.8|edu.harvard.iq.dataverse.search.IndexBatchServiceBean|_ThreadID=203;_ThreadName=__ejb-thread-pool7;_TimeMillis=1715613583128;_LevelValue=800;|
  checking for datasets in Solr only|#]

[#|2024-05-13T17:19:43.193+0200|INFORMATION|Payara 6.2023.8|edu.harvard.iq.dataverse.search.IndexBatchServiceBean|_ThreadID=203;_ThreadName=__ejb-thread-pool7;_TimeMillis=1715613583193;_LevelValue=800;|
  checking for files in Solr only|#]

[#|2024-05-13T17:19:43.570+0200|INFORMATION|Payara 6.2023.8|edu.harvard.iq.dataverse.search.IndexBatchServiceBean|_ThreadID=203;_ThreadName=__ejb-thread-pool7;_TimeMillis=1715613583570;_LevelValue=800;|
  completed check for content in Solr but not database|#]

[#|2024-05-13T17:19:44.362+0200|INFORMATION|Payara 6.2023.8|edu.harvard.iq.dataverse.search.IndexBatchServiceBean|_ThreadID=203;_ThreadName=__ejb-thread-pool7;_TimeMillis=1715613584362;_LevelValue=800;|
  checking for permissions in database but stale or missing from Solr|#]

[#|2024-05-13T17:19:44.373+0200|INFORMATION|Payara 6.2023.8|edu.harvard.iq.dataverse.search.IndexBatchServiceBean|_ThreadID=203;_ThreadName=__ejb-thread-pool7;_TimeMillis=1715613584373;_LevelValue=800;|
  completed checking for permissions in database but stale or missing from Solr|#]

[#|2024-05-13T17:19:44.374+0200|INFORMATION|Payara 6.2023.8|edu.harvard.iq.dataverse.search.IndexBatchServiceBean|_ThreadID=203;_ThreadName=__ejb-thread-pool7;_TimeMillis=1715613584374;_LevelValue=800;|
  contentInDatabaseButStaleInOrMissingFromIndex: {"dataverses":[],"datasets":[]}|#]

[#|2024-05-13T17:19:44.375+0200|INFORMATION|Payara 6.2023.8|edu.harvard.iq.dataverse.search.IndexBatchServiceBean|_ThreadID=203;_ThreadName=__ejb-thread-pool7;_TimeMillis=1715613584375;_LevelValue=800;|
  contentInIndexButNotDatabase: {"dataverses":[],"datasets":[],"files":[]}|#]
[#|2024-05-13T17:19:44.375+0200|INFORMATION|Payara 6.2023.8|edu.harvard.iq.dataverse.search.IndexBatchServiceBean|_ThreadID=203;_ThreadName=__ejb-thread-pool7;_TimeMillis=1715613584375;_LevelValue=800;|
  permissionsInDatabaseButStaleInOrMissingFromIndex: {"dvobjects":[]}|#]

[#|2024-05-13T17:19:44.375+0200|INFORMATION|Payara 6.2023.8|edu.harvard.iq.dataverse.search.IndexBatchServiceBean|_ThreadID=203;_ThreadName=__ejb-thread-pool7;_TimeMillis=1715613584375;_LevelValue=800;|
  permissionsInIndexButNotDatabase: {"permissions":["datafile_1000_draft_permission","datafile_1001_draft_permission","datafile_1004_draft_permission",
"datafile_1007_draft_permission","datafile_1008_draft_permission","datafile_1009_draft_permission",
"datafile_1012_draft_permission","datafile_1013_draft_permission","datafile_1015_draft_permission",
"datafile_1016_draft_permission","datafile_1019_draft_permission","datafile_1020_draft_permission",
"datafile_1021_draft_permission","datafile_1022_draft_permission","datafile_1023_draft_permission","datafile_1024_draft_permission","datafile_1027_draft_permission","datafile_1030_draft_permission","datafile_1031_draft_permission","datafile_1032_draft_permission","datafile_1034_draft_permission","datafile_1037_draft_permission","datafile_1038_draft_permission","datafile_1039_draft_permission","datafile_1041_draft_permission","datafile_1044_draft_permission","datafile_1046_draft_permission","datafile_1049_draft_permission","datafile_1051_draft_permission","datafile_1055_draft_permission","datafile_1057_draft_permission","datafile_1060_draft_permission","datafile_1061_draft_permission","datafile_1067_draft_permission","datafile_1068_draft_permission","datafile_1070_draft_permission","datafile_1073_draft_permission","datafile_1079_draft_permission","datafile_1088_draft_permission","datafile_1091_draft_permission","datafile_1092_draft_permission","datafile_1094_draft_permission","datafile_1097_draft_permission","datafile_1098_draft_permission","datafile_1102_draft_permission","datafile_1104_draft_permission","datafile_1108_draft_permission","datafile_1109_draft_permission","datafile_1114_draft_permission","datafile_1115_draft_permission","datafile_1125_draft_permission","datafile_1128_draft_permission","datafile_1137_draft_permission","datafile_1139_draft_permission","datafile_1140_draft_permission","datafile_1143_draft_permission","datafile_1146_draft_permission","datafile_1147_draft_permission","datafile_1150_draft_permission","datafile_1153_draft_permission","datafile_1154_draft_permission","datafile_1159_draft_permission","datafile_1161_draft_permission","datafile_1165_draft_permission","datafile_1171_draft_permission","datafile_1177_draft_permission","datafile_1178_draft_permission","datafile_1179_draft_permission","datafile_1182_draft_permission","datafile_1184_draft_permission","datafile_1186_draft_permission","datafile_1188_draft_permission","datafile_1192_draft
_permission","datafile_1194_draft_permission","datafile_1195_draft_permission","datafile_1197_draft_permission","datafile_1200_draft_permission","datafile_1201_draft_permission","datafile_1202_draft_permission","datafile_1204_draft_permission","datafile_1205_draft_permission","datafile_1210_draft_permission","datafile_1214_draft_permission","datafile_1217_draft_permission","datafile_1219_draft_permission","datafile_1222_draft_permission","datafile_1223_draft_permission","datafile_1226_draft_permission","datafile_1227_draft_permission","datafile_1230_draft_permission","datafile_1231_draft_permission","datafile_1232_draft_permission","datafile_1236_draft_permission","datafile_1249_draft_permission","datafile_1250_draft_permission","datafile_1252_draft_permission","datafile_1259_draft_permission","datafile_1260_draft_permission","datafile_1262_draft_permission","datafile_1267_draft_permission","datafile_1269_draft_permission","datafile_1270_draft_permission","datafile_1277_draft_permission","datafile_1278_draft_permission","datafile_1281_draft_permission","datafile_1282_draft_permission","datafile_1285_draft_permission","datafile_1293_draft_permission","datafile_1302_draft_permission","datafile_1303_draft_permission","datafile_1304_draft_permission","datafile_1311_draft_permission","datafile_1317_draft_permission","datafile_1319_draft_permission","datafile_1321_draft_permission","datafile_1325_draft_permission","datafile_1326_draft_permission","datafile_1327_draft_permission","datafile_1331_draft_permission","datafile_1335_draft_permission","datafile_1336_draft_permission","datafile_1338_draft_permission","datafile_1342_draft_permission","datafile_1351_draft_permission","datafile_1352_draft_permission","datafile_1353_draft_permission","datafile_1359_draft_permission","datafile_1366_draft_permission","datafile_1368_draft_permission","datafile_1371_draft_permission","datafile_1372_draft_permission","datafile_1374_draft_permission","datafile_1376_draft_permission","datafi
le_1377_draft_permission","datafile_1378_draft_permission","datafile_1381_draft_permission","datafile_1382_draft_permission","datafile_1389_draft_permission","datafile_1391_draft_permission","datafile_1406_draft_permission","datafile_1413_draft_permission","datafile_1418_draft_permission","datafile_1422_draft_permission","datafile_1424_draft_permission","datafile_1425_draft_permission","datafile_1430_draft_permission","datafile_1432_draft_permission","datafile_1438_draft_permission","datafile_1439_draft_permission","datafile_1444_draft_permission","datafile_1449_draft_permission","datafile_1454_draft_permission","datafile_1456_draft_permission","datafile_1464_draft_permission","datafile_1466_draft_permission","datafile_1475_draft_permission","datafile_1479_draft_permission","datafile_1480_draft_permission","datafile_1486_draft_permission","datafile_1487_draft_permission","datafile_1489_draft_permission","datafile_1491_draft_permission","datafile_1498_draft_permission","datafile_1500_draft_permission","datafile_1502_draft_permission","datafile_1505_draft_permission","datafile_1513_draft_permission","datafile_1514_draft_permission","datafile_1520_draft_permission","datafile_1521_draft_permission","datafile_1523_draft_permission","datafile_1528_draft_permission","datafile_1533_draft_permission","datafile_1534_draft_permission","datafile_1544_draft_permission","datafile_1551_draft_permission","datafile_975_draft_permission","datafile_980_draft_permission","datafile_981_draft_permission","datafile_985_draft_permission","datafile_988_draft_permission","datafile_989_draft_permission","datafile_991_draft_permission","datafile_993_draft_permission","datafile_994_draft_permission","datafile_997_draft_permission","datafile_998_draft_permission"]}|#]

view this post on Zulip Henning Timm (May 13 2024 at 15:42):

It found a lot of permissions that were in the index but not in the database, but those all seem to be for drafts.
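
A quick way to check the "all drafts" observation is to filter the reported entries by suffix. This is only a sketch: the two sample IDs are taken from the log above, and the variable names are made up for illustration. In practice the whole JSON object from the log line would be pasted in.

```python
import json

# Shortened sample of the "permissionsInIndexButNotDatabase" payload from
# the log above; paste the full JSON object here to check every entry.
status = json.loads(
    '{"permissions": ["datafile_1000_draft_permission",'
    ' "datafile_975_draft_permission"]}'
)

# Collect every entry that is NOT a draft permission document.
non_draft = [
    entry
    for entry in status["permissions"]
    if not entry.endswith("_draft_permission")
]

print(f"{len(status['permissions'])} entries, {len(non_draft)} not for drafts")
```

An empty `non_draft` list would confirm that only draft permission documents are affected.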

view this post on Zulip Philip Durbin 🚀 (May 13 2024 at 15:46):

Can you retrieve datasets via API?

view this post on Zulip Henning Timm (May 13 2024 at 15:57):

I can find datasets via the search API, but when I try to access the dataset I get

{"status":"ERROR","message":"Dataset with Persistent ID perma:<my-PID> not found."}

Where <my-PID> is one of the PIDs I got from the search.
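
For anyone reproducing this check, a lookup by persistent ID can be sketched in Python as below. The endpoint path follows the Dataverse native API (`/api/datasets/:persistentId/?persistentId=...`). The base URL, the example PID and the function names are placeholders for illustration, not values from this thread.

```python
import json
import urllib.error
import urllib.request
from urllib.parse import urlencode


def dataset_by_pid_url(base_url: str, pid: str) -> str:
    """Build the native API URL for looking up a dataset by persistent ID."""
    query = urlencode({"persistentId": pid})  # percent-encodes the ':' in the PID
    return f"{base_url.rstrip('/')}/api/datasets/:persistentId/?{query}"


def fetch_dataset(base_url: str, pid: str) -> dict:
    """Return the parsed JSON response, including the body of error responses."""
    try:
        with urllib.request.urlopen(dataset_by_pid_url(base_url, pid)) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as err:
        # Dataverse normally reports errors as JSON too,
        # e.g. {"status": "ERROR", "message": "..."}.
        return json.loads(err.read().decode("utf-8"))


# Hypothetical installation URL and PID, for illustration only.
url = dataset_by_pid_url("http://localhost:8080", "perma:thz-ds-abc123")
print(url)
```

Calling `fetch_dataset` against a running installation should return either the dataset JSON or an error object like the one quoted above.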

view this post on Zulip Henning Timm (May 13 2024 at 16:03):

Maybe I damaged my database during the update? :thinking:

view this post on Zulip Henning Timm (May 13 2024 at 16:04):

Thank you for the help. I will need to go offline for a few hours. It's already 6 pm in Germany ;)

view this post on Zulip Philip Durbin 🚀 (May 13 2024 at 16:07):

No worries.

view this post on Zulip Philip Durbin 🚀 (May 13 2024 at 16:07):

"perma". Hmm.

view this post on Zulip Henning Timm (May 14 2024 at 08:55):

Some further digging: When manually querying SOLR for a dataset, all Datasets show up and all the Metadata are there. So I think the database is intact. It just seems that Dataverse does not find a page for the given PID.

When I enter a URL with a bogus PID (e.g. substitute it with fdaafdsa or replace perma: with doi:), I get an Internal Server Error; only for valid PIDs do I get the "Page not Found" error.

view this post on Zulip Philip Durbin 🚀 (May 14 2024 at 11:19):

Ah, I bet you need this fix: IQSS/10516 fix perma legacy support #10521

view this post on Zulip Philip Durbin 🚀 (May 14 2024 at 11:19):

Or an equivalent workaround.

view this post on Zulip Philip Durbin 🚀 (May 14 2024 at 11:20):

Oh, wait... this is you: https://github.com/IQSS/dataverse/issues/10516#issuecomment-2097916267

view this post on Zulip Philip Durbin 🚀 (May 14 2024 at 11:21):

@Henning Timm perhaps you should create a new GitHub issue

view this post on Zulip Philip Durbin 🚀 (May 14 2024 at 11:21):

And link to the old one

view this post on Zulip Henning Timm (May 14 2024 at 11:59):

Will do.
I suspect some trouble with the legacy PID system. On my staging server I had the following database settings for PIDs:

{
  "data": {
    ":Authority": "thz",
    ":Shoulder": "-ds",

On the production server that is giving me trouble now, it looked like this:

{
  "data": {
    ":Authority": "thz-",
    ":Shoulder": "ds-",

Maybe this is connected to the problem somehow.

view this post on Zulip Philip Durbin 🚀 (May 14 2024 at 12:50):

Oh. Huh. Yeah, I suppose that could explain it. Not sure.

view this post on Zulip Henning Timm (May 14 2024 at 13:20):

A look into the database supports the idea that this might be the issue. In the database, the authority is (correctly) listed as thz-, but when I create a new dataset it is just thz, since the - is now the separator. I am not 100% sure how the split is performed, but I could imagine that the parsing function (https://github.com/IQSS/dataverse/blob/dae5ca7dc46c1272d65b36688f6fded224e7288c/src/main/java/edu/harvard/iq/dataverse/pidproviders/AbstractPidProvider.java#L291) is truncating the separator, so the DB is queried with "thz" when it should be queried with "thz-".
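
The suspected mismatch can be illustrated with a toy model. This is a hypothetical sketch, not the actual Dataverse parsing code: the parser, the lookup table and the identifier "ds-abc123" are invented for illustration, and it only assumes what is described above (legacy rows stored with authority "thz-", new provider configured with authority "thz" and separator "-").

```python
def parse_permalink(pid: str, authority: str, separator: str) -> tuple[str, str]:
    """Hypothetical parser: split a PID into (authority, identifier).

    The separator is consumed during parsing and appears in neither part.
    """
    protocol, _, rest = pid.partition(":")
    if protocol != "perma":
        raise ValueError(f"unsupported protocol: {protocol!r}")
    prefix = authority + separator
    if not rest.startswith(prefix):
        raise ValueError(f"{pid!r} does not match authority {authority!r}")
    return authority, rest[len(prefix):]


# Toy stand-in for the database: the legacy row keeps the trailing "-"
# as part of the authority, as seen on the production server.
legacy_rows = {("thz-", "ds-abc123")}


def lookup(authority: str, identifier: str) -> bool:
    """Exact-match lookup, mimicking a query on authority + identifier."""
    return (authority, identifier) in legacy_rows


parsed = parse_permalink("perma:thz-ds-abc123", authority="thz", separator="-")
found = lookup(*parsed)
print(parsed, found)
```

Under these assumptions the parsed authority is "thz", so the exact-match lookup misses the legacy row stored under "thz-". That would match a "not found" result for a PID that is otherwise valid.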

view this post on Zulip Henning Timm (May 14 2024 at 13:21):

I could probably work around that by using another separator for the new PID provider.

view this post on Zulip Philip Durbin 🚀 (May 14 2024 at 13:54):

Sounds good. If you need a hand, please just make some noise.

Also, would you consider this a bug? If so, please feel free to create an issue.

view this post on Zulip Henning Timm (May 15 2024 at 10:05):

It's running again! :tada:
Using the separator "/" instead of "-" when setting up the new permalink provider solved this.

view this post on Zulip Henning Timm (May 15 2024 at 10:05):

Thanks again for all the help!

view this post on Zulip Henning Timm (May 15 2024 at 10:07):

I think I would consider this a bug and will create an issue describing it. We were a bit confused here about how exactly the parsing of PIDs works, so we set aside some time in the first week of June to build up a test environment and dig into this. I am not sure we can provide a fix, but we will certainly try ;)

view this post on Zulip Notification Bot (May 15 2024 at 10:07):

Henning Timm has marked this topic as resolved.

view this post on Zulip Philip Durbin 🚀 (May 15 2024 at 10:51):

At the very least, perhaps some documentation can be written. Please do go ahead and create an issue, if you don't mind! I'm glad you fixed it! :tada:

view this post on Zulip Henning Timm (May 15 2024 at 16:13):

Done :) https://github.com/IQSS/dataverse/issues/10564

view this post on Zulip Philip Durbin 🚀 (May 15 2024 at 16:31):

Thanks! :heart:


Last updated: Oct 30 2025 at 06:21 UTC