I am expanding the monitoring of our Dataverse instance and have come across a curious thing: if I get the number of datasets with the Metrics API, I get 72, but if I use the Search API, I get 73. Has anyone seen a similar discrepancy? Maybe some types of datasets are not being counted?
Interesting. Are you able to identify which one is missing from the metrics API?
For the Search API, are you authenticating and showing drafts as well?
No, I use both endpoints without authentication.
As for identifying it: I am working on it, but the Metrics API only returns aggregates.
Oh, right.
OK, this is weird: the /monthly metrics endpoint says 73 for May and 72 for April...
https://repo.researchdata.hu/api/info/metrics/datasets
https://repo.researchdata.hu/api/info/metrics/datasets/monthly
so maybe the first endpoint displays the data for the last whole month?
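One way to check that guess is to look up which month in the /monthly series carries the same count as the plain endpoint. A minimal Python sketch, assuming the JSON form of /monthly wraps a list of entries shaped like {"date": "YYYY-MM", "count": N} under a "data" key (that shape is an assumption, not something confirmed in this thread); the function name is made up for illustration:

```python
def months_matching(monthly_payload, count):
    """Return the months whose reported count equals the given count.

    Assumes monthly_payload looks like
    {"data": [{"date": "YYYY-MM", "count": N}, ...]} (an assumption).
    """
    return [
        entry["date"]
        for entry in monthly_payload["data"]
        if int(entry["count"]) == count
    ]
```

If the value from the plain endpoint only matches an earlier month, that endpoint is probably serving an older result rather than the current state.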
https://repo.researchdata.hu/api/info/metrics/datasets?dataLocation=all -> 73
https://repo.researchdata.hu/api/info/metrics/datasets?dataLocation=local -> 72
https://repo.researchdata.hu/api/info/metrics/datasets?dataLocation=remote -> 0
Is there a dataLocation that is not local or remote? Because the numbers seem to support that. :)
It looks like you found the three possibilities:
public static final String DATA_LOCATION_LOCAL = "local";
public static final String DATA_LOCATION_REMOTE = "remote";
public static final String DATA_LOCATION_ALL = "all";
From here (you seem to be on 6.1): https://github.com/IQSS/dataverse/blob/v6.1/src/main/java/edu/harvard/iq/dataverse/metrics/MetricsUtil.java#L38-L40
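To watch this from a monitoring script, the three dataLocation values can be queried side by side and checked for whether local + remote adds up to all. A rough Python sketch, assuming each response looks like {"status": "OK", "data": {"count": N}} (the response shape and all function names here are assumptions for illustration):

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

DATA_LOCATIONS = ("local", "remote", "all")  # the three values in MetricsUtil.java


def metrics_url(base_url, data_location):
    """Build the dataset-count metrics URL for one dataLocation value."""
    query = urlencode({"dataLocation": data_location})
    return f"{base_url}/api/info/metrics/datasets?{query}"


def extract_count(payload):
    """Read the count, assuming a body like {"data": {"count": N}}."""
    return int(payload["data"]["count"])


def unexplained_gap(counts):
    """Return all - (local + remote); non-zero means the numbers disagree."""
    return counts["all"] - (counts["local"] + counts["remote"])


def fetch_counts(base_url):
    """Query every dataLocation value (performs network requests)."""
    counts = {}
    for location in DATA_LOCATIONS:
        with urlopen(metrics_url(base_url, location), timeout=30) as response:
            counts[location] = extract_count(json.load(response))
    return counts
```

With the numbers reported above (all=73, local=72, remote=0), unexplained_gap returns 1, so a monitor could alert whenever the gap is not 0.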
Yes, /monthly seems to be cumulative.
I was wondering if one of your 73 datasets returned by the Search API is deaccessioned (deaccessioned datasets are not included in metrics), but no, they are all released:
curl 'https://repo.researchdata.hu/api/search?q=*&type=dataset&per_page=100' | jq . | grep versionState
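The same check can be done in Python by tallying versionState instead of reading grep output. This is only a sketch: it assumes the search response keeps its results under data.items and that each dataset item carries a versionState field, as the grep above suggests; the function names are hypothetical.

```python
import json
from collections import Counter
from urllib.parse import urlencode
from urllib.request import urlopen


def search_url(base_url, per_page=100):
    """Build a Search API URL listing datasets, mirroring the curl call."""
    query = urlencode({"q": "*", "type": "dataset", "per_page": per_page})
    return f"{base_url}/api/search?{query}"


def version_state_counts(payload):
    """Count items per versionState; items without the field count as UNKNOWN."""
    items = payload["data"]["items"]
    return Counter(item.get("versionState", "UNKNOWN") for item in items)


def fetch_version_state_counts(base_url):
    """Fetch the first page of results and tally them (network request)."""
    with urlopen(search_url(base_url), timeout=30) as response:
        return version_state_counts(json.load(response))
```

Only the first page is inspected, which is enough for 73 datasets with per_page=100; a larger installation would need to page through the results.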
Well, I will stop investigating this for now (I may look into the underlying SQL queries later) and use the dataLocation=all option.
The difference between local and all will remain a mystery for now.
I asked in the IQSS Slack. If I hear anything, I'll let you know.
@Péter Pallinger what if you clear the cache for the Metrics API? Any difference?
I did not find a way to clear the cache, but setting MetricsCacheTimeoutMinutes=1 fixed it. MetricsCacheTimeoutMinutes was not set before, and the default is probably quite long.
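For anyone who wants to script this, here is a hedged Python sketch of changing the setting. It assumes the setting is stored as a database setting named :MetricsCacheTimeoutMinutes (with a leading colon) and that the admin settings API is reachable at http://localhost:8080/api/admin/settings, which is normally only the case on the server itself; both are assumptions, so check them against your installation. The function names are made up for illustration.

```python
from urllib.request import Request, urlopen

# Assumed admin API location; usually only reachable from the server itself.
ADMIN_BASE = "http://localhost:8080"


def build_setting_request(name, value, admin_base=ADMIN_BASE):
    """Prepare (but do not send) a PUT that stores value under setting name."""
    url = f"{admin_base}/api/admin/settings/{name}"
    return Request(url, data=str(value).encode("utf-8"), method="PUT")


def apply_setting(name, value, admin_base=ADMIN_BASE):
    """Send the PUT and return the HTTP status code (network request)."""
    with urlopen(build_setting_request(name, value, admin_base), timeout=30) as response:
        return response.status
```

Calling apply_setting(":MetricsCacheTimeoutMinutes", 1) would correspond to the change described above. A one-minute timeout makes the metrics queries rerun often, so a larger value may be a better long-term choice once the stale result is gone.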
Péter Pallinger has marked this topic as resolved.
Last updated: Oct 30 2025 at 06:21 UTC