issue with solr indexing messages · troubleshooting

Hello .. I've noticed lately that when I initiate a reindex in all three of our Dataverse installations, the following is occurring:

When I type "curl http://localhost:8080/api/admin/index/clear" .. the log shows "attempting to delete all Solr documents before a complete re-index" and the console shows "{"status":"OK","data":{"numRowsClearedByClearAllIndexTimes":3,"message":"Solr index and database index timestamps cleared."}}"

.. then, if I type "curl http://localhost:8080/api/admin/index" .. the console shows "{"status":"OK","data":{"availablePartitionIds":[0],"args":{"numPartitions":1,"partitionIdToProcess":0},"message":"indexAllOrSubset has begun of 1 dataverses and 1 datasets."}}" .... and the log goes through the indexing of all the dataverses and datasets (in this case there is only one) .. and then shows this message "1 dataverses and 1 datasets indexed. index all took 249 milliseconds. Solr index was not cleared before indexing."

I did find a few issues related to this, but they were from 7 years ago. There were some similar ones more recently, but in those cases there was an exception error and failures of some of the datasets indexing. I did reinstall solr and this particular installation is brand new. And then created one dataset. =)

We are running v6.6 on our dev site (the one above with 1 dataset), which was a complete new install. QA and prod are running v6.5 and both experience the same thing; QA currently has 0 datasets and prod has almost 100.

.. all of the "checks" messages in the logs seem normal when i check the status or timestamps ..

Philip Durbin 🚀 (Apr 07 2025 at 19:31):

Deirdre Kirmis (Apr 07 2025 at 19:32):

no nothing .. the "index was not cleared" message is the last one .. until i check the status and then i get all the normal status messages

Deirdre Kirmis (Apr 07 2025 at 19:36):

Philip Durbin 🚀 (Apr 07 2025 at 19:38):

This is what I see (I'm using Docker and only have the root collection and running the latest in the "develop" branch):

dev_dataverse>   indexing dataverse 1 of 1 (id=1, persistentId=root)|#]
dev_dataverse>
dev_dataverse> [#|2025-04-07T19:37:53.336+0000|INFO|Payara 6.2025.2|edu.harvard.iq.dataverse.search.IndexBatchServiceBean|_ThreadID=254;_ThreadName=__ejb-thread-pool2;_TimeMillis=1744054673336;_LevelValue=800;|
dev_dataverse>   done iterating through all datasets|#]
dev_dataverse>
dev_dataverse> [#|2025-04-07T19:37:53.336+0000|INFO|Payara 6.2025.2|edu.harvard.iq.dataverse.search.IndexBatchServiceBean|_ThreadID=254;_ThreadName=__ejb-thread-pool2;_TimeMillis=1744054673336;_LevelValue=800;|
dev_dataverse>   index all took 5 milliseconds|#]
dev_dataverse>
dev_dataverse> [#|2025-04-07T19:37:53.337+0000|INFO|Payara 6.2025.2|edu.harvard.iq.dataverse.search.IndexBatchServiceBean|_ThreadID=254;_ThreadName=__ejb-thread-pool2;_TimeMillis=1744054673337;_LevelValue=800;|
dev_dataverse>   1 dataverses and 0 datasets indexed. index all took 5 milliseconds. Solr index was not cleared before indexing.
dev_dataverse> |#]

Philip Durbin 🚀 (Apr 07 2025 at 19:39):

Deirdre Kirmis (Apr 07 2025 at 19:39):

.. lol well increasing logging level gave a LOT of detail for each indexed dataset, but same result on the "not cleared" message .. that is the last message

Philip Durbin 🚀 (Apr 07 2025 at 19:40):

Deirdre Kirmis (Apr 07 2025 at 19:40):

oh sorry .. i looked at yours and for some reason thought it said it was cleared

Deirdre Kirmis (Apr 07 2025 at 19:41):

Philip Durbin 🚀 (Apr 07 2025 at 19:41):

When you say the console doesn't come back... you mean it hangs? It should spit out some JSON and give you your prompt back.

Deirdre Kirmis (Apr 07 2025 at 19:41):

yes, on the console it gives the "index has begun" message, but never comes back and says it was complete

Deirdre Kirmis (Apr 07 2025 at 19:42):

curl http://localhost:8080/api/admin/index
{"status":"OK","data":{"availablePartitionIds":[0],"args":{"numPartitions":1,"partitionIdToProcess":0},"message":"indexAllOrSubset has begun of 1 dataverses and 1 datasets."}}
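
Going by the output quoted in this thread, the index call hands back its "has begun" JSON right away and the completion summary only appears in the server log. A minimal sketch for pulling that summary line out of a log file follows; the function name `last_index_summary` and the `SERVER_LOG` variable are hypothetical, and the log location depends on your installation.

```shell
#!/bin/sh
# Sketch only: print the most recent "N dataverses and M datasets indexed"
# summary line from a log file. The pattern is based on the log lines quoted
# in this thread; other Dataverse versions may word the message differently.
last_index_summary() {
  grep -E '[0-9]+ dataverses and [0-9]+ datasets indexed' "$1" | tail -n 1
}

# Example usage (commented out; SERVER_LOG is an assumed variable that you
# would set to your own Payara server log path):
# curl http://localhost:8080/api/admin/index
# last_index_summary "$SERVER_LOG"
```

Since a large reindex can take a while to reach the summary line, the function may print nothing (or an older summary) if it is run too soon after the curl call.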

Stream: troubleshooting

Topic: issue with solr indexing messages

Deirdre Kirmis (Apr 07 2025 at 19:28):

Deirdre Kirmis (Apr 07 2025 at 19:44):

Deirdre Kirmis (Apr 07 2025 at 19:44):

Philip Durbin 🚀 (Apr 07 2025 at 19:45):