Hi :)
I'm experiencing issues with the from= parameter when doing a partial harvest.
Most of my repositories expect the format YYYY-MM-DDThh:mm:ssZ (which Dataverse sends) but a few expect the format YYYY-MM-DD.
And I have one that expects a real timestamp (in seconds).
Errors look like
<error code="badArgument">From must be a datestamp</error>
<error code="badArgument">from: Invalid date & time</error>
<error code="badArgument" />
<error code="badArgument">The request includes illegal arguments, is missing required arguments, includes a repeated argument, or values for arguments have an illegal syntax.</error>
Ex:
https://cds.unistra.fr/registry/?verb=ListRecords&metadataPrefix=oai_dc&from=2024-11-07T14%3A40%3A49Z
https://cds.unistra.fr/registry/?verb=ListRecords&metadataPrefix=oai_dc&from=2024-11-07
Have you experienced this?
Should we add a harvesting client configuration option to specify the format to send?
One related post: https://groups.google.com/g/dataverse-community/c/ceXljDp2uTw/m/1hHX3taGAQAJ
Sorry, I'm having a little trouble following this. The problem occurs when Dataverse is acting as a harvesting server, right?
Yes.
The first run is fine; subsequent runs add the from= parameter.
To pick up any changes. Got it. Sounds like a bug. Or some ambiguity in the spec? Does the spec have anything to say about timestamp vs datestamp?
Both seem to be OK, and most of my repositories work with both.
But I didn't find the answer.
Ok, can you please create an issue?
Do you feel Dataverse should handle it? Or should I tell the repositories that they are not permissive enough?
Well, I'm curious what the spec says. Shouldn't the spec say who's right? :grinning:
https://www.openarchives.org/OAI/openarchivesprotocol.html#Dates
Datestamps used as values of the optional arguments from and until in the ListIdentifiers and ListRecords requests are encoded using ISO8601 and are expressed in UTC. These arguments are used to specify datestamp-based selective harvesting. These arguments support the "Complete date" and the "Complete date plus hours, minutes and seconds" granularities defined in ISO8601. The legitimate formats are YYYY-MM-DD and YYYY-MM-DDThh:mm:ssZ. Both arguments must have the same granularity. All repositories must support YYYY-MM-DD. A repository that supports YYYY-MM-DDThh:mm:ssZ should indicate so in the Identify response. A request by a harvester with finer granularity than that supported by a repository must produce an error.
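To make the two legal formats concrete, here is a minimal Python sketch of formatting a datestamp at either granularity. It is only an illustration of the spec text above, not Dataverse code; the helper name `format_oai_datestamp` and the constants are hypothetical.

```python
from datetime import datetime, timezone

# The two granularities the OAI-PMH spec allows for from/until.
DAY = "YYYY-MM-DD"
SECONDS = "YYYY-MM-DDThh:mm:ssZ"


def format_oai_datestamp(moment: datetime, granularity: str) -> str:
    """Format a datetime as an OAI-PMH datestamp (hypothetical helper)."""
    # Assumption: a naive datetime is already expressed in UTC.
    if moment.tzinfo is None:
        moment = moment.replace(tzinfo=timezone.utc)
    utc = moment.astimezone(timezone.utc)
    if granularity == SECONDS:
        return utc.strftime("%Y-%m-%dT%H:%M:%SZ")
    if granularity == DAY:
        return utc.strftime("%Y-%m-%d")
    raise ValueError(f"unsupported granularity: {granularity!r}")
```

Every repository must accept the `DAY` form, so it is the safe choice when the repository's granularity is unknown.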
Now we have the answer :D
I'll get back to the repository admins.
luddaniel has marked this topic as resolved.
My god, I read that so badly :D
All repositories must support YYYY-MM-DD. A repository that supports YYYY-MM-DDThh:mm:ssZ should indicate so in the Identify response.
https://api.nakala.fr/oai2?verb=Identify
<granularity>YYYY-MM-DD</granularity>
https://api.nakala.fr/oai2?verb=ListRecords&metadataPrefix=oai_dc&from=2024-11-07T14%3A40%3A49Z
Error Code badArgument
https://www.seanoe.org/oai/OAIHandler?verb=Identify
<granularity>YYYY-MM-DD</granularity>
https://www.seanoe.org/oai/OAIHandler?verb=ListRecords&metadataPrefix=oai_dc&from=2024-11-07T14%3A40%3A49Z
<error code="badArgument">The request includes illegal arguments, is missing required arguments, includes a repeated argument, or values for arguments have an illegal syntax.</error>
So there is an issue in Dataverse: it always uses YYYY-MM-DDThh:mm:ssZ. I'll create an issue and look at a fix soon.
https://github.com/IQSS/dataverse/issues/11020
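For illustration, one way a harvester could respect the advertised granularity is sketched below in Python. This is not the actual Dataverse fix (Dataverse is written in Java); the function names are hypothetical, and the sketch assumes the Identify response uses the standard OAI-PMH 2.0 namespace and that the last-harvest time is a timezone-aware datetime.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from urllib.parse import urlencode

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"
DAY = "YYYY-MM-DD"
SECONDS = "YYYY-MM-DDThh:mm:ssZ"


def granularity_from_identify(identify_xml: str) -> str:
    """Read <granularity> from an Identify response (hypothetical helper)."""
    root = ET.fromstring(identify_xml)
    node = root.find(f".//{OAI_NS}granularity")
    # Every repository must support day granularity, so fall back to it.
    if node is None or not node.text:
        return DAY
    return node.text.strip()


def list_records_url(base_url: str, last_harvest: datetime,
                     granularity: str, prefix: str = "oai_dc") -> str:
    """Build a ListRecords URL whose from= matches the repository's granularity."""
    # Assumption: last_harvest is timezone-aware.
    utc = last_harvest.astimezone(timezone.utc)
    if granularity == SECONDS:
        stamp = utc.strftime("%Y-%m-%dT%H:%M:%SZ")
    else:
        # Truncating to the day re-harvests part of that day but misses nothing.
        stamp = utc.strftime("%Y-%m-%d")
    query = urlencode({"verb": "ListRecords",
                       "metadataPrefix": prefix,
                       "from": stamp})
    return f"{base_url}?{query}"
```

With the `YYYY-MM-DD` granularity reported by the two repositories above, this would produce a from=2024-11-07 request instead of the rejected from=2024-11-07T14%3A40%3A49Z.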
Philip Durbin has marked this topic as unresolved.
@luddaniel no worries and thanks for creating that issue!
Last updated: Nov 01 2025 at 14:11 UTC