Stream: troubleshooting

Topic: ✔ metrics api and search api numbers differ


view this post on Zulip Péter Pallinger (May 06 2024 at 11:20):

I am expanding the monitoring of our dataverse instance, and have come across a curious thing: If I get the number of datasets with the metrics api, I get 72, but if I use the search api, I get 73. Has anyone seen a similar error? Maybe some types of datasets are not being shown?

view this post on Zulip Philip Durbin 🚀 (May 06 2024 at 11:24):

Interesting. Are you able to identify which one is missing from the metrics API?

view this post on Zulip Philip Durbin 🚀 (May 06 2024 at 11:24):

For the search API, are you authenticating and showing drafts as well?

view this post on Zulip Péter Pallinger (May 06 2024 at 11:28):

No, I use both endpoints without authentication.
As for identifying: I am working on it, but the metrics api only displays aggregates.

view this post on Zulip Philip Durbin 🚀 (May 06 2024 at 11:29):

Oh, right.

view this post on Zulip Péter Pallinger (May 06 2024 at 11:29):

OK, this is weird: the /monthly metrics endpoint says 73 for May and 72 for April...

view this post on Zulip Péter Pallinger (May 06 2024 at 11:30):

https://repo.researchdata.hu/api/info/metrics/datasets
https://repo.researchdata.hu/api/info/metrics/datasets/monthly

view this post on Zulip Péter Pallinger (May 06 2024 at 11:31):

so maybe the first endpoint displays the data for the last whole month?

view this post on Zulip Péter Pallinger (May 06 2024 at 11:35):

https://repo.researchdata.hu/api/info/metrics/datasets?dataLocation=all -> 73
https://repo.researchdata.hu/api/info/metrics/datasets?dataLocation=local -> 72
https://repo.researchdata.hu/api/info/metrics/datasets?dataLocation=remote -> 0
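
The comparison above can be scripted. This is a minimal sketch, not part of the original thread. It assumes the metrics endpoint returns JSON shaped like `{"status": "OK", "data": {"count": 73}}`; the function names (`metrics_url`, `extract_count`, `fetch_counts`, `unexplained_gap`) are hypothetical helpers introduced for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://repo.researchdata.hu"  # instance discussed in the thread


def metrics_url(base_url, data_location):
    """Build the dataset-count metrics URL for one dataLocation value."""
    query = urlencode({"dataLocation": data_location})
    return f"{base_url}/api/info/metrics/datasets?{query}"


def extract_count(payload):
    """Pull the count out of a metrics response (assumed shape: data.count)."""
    return int(payload["data"]["count"])


def fetch_counts(base_url=BASE_URL, locations=("all", "local", "remote")):
    """Query the metrics endpoint once per dataLocation (needs network access)."""
    counts = {}
    for location in locations:
        with urlopen(metrics_url(base_url, location), timeout=30) as response:
            counts[location] = extract_count(json.load(response))
    return counts


def unexplained_gap(counts):
    """Datasets counted under 'all' but under neither 'local' nor 'remote'."""
    return counts["all"] - (counts["local"] + counts["remote"])
```

Calling `unexplained_gap(fetch_counts())` would report the discrepancy. A non-zero value can simply mean the three responses were cached at different times, which is what the thread eventually concludes.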

view this post on Zulip Péter Pallinger (May 06 2024 at 11:38):

Is there a dataLocation that is not local or remote? Because the numbers seem to support that. :)

view this post on Zulip Philip Durbin 🚀 (May 06 2024 at 14:00):

It looks like you found the three possibilities:

public static final String DATA_LOCATION_LOCAL = "local";
public static final String DATA_LOCATION_REMOTE = "remote";
public static final String DATA_LOCATION_ALL = "all";

view this post on Zulip Philip Durbin 🚀 (May 06 2024 at 14:01):

From here (you seem to be on 6.1): https://github.com/IQSS/dataverse/blob/v6.1/src/main/java/edu/harvard/iq/dataverse/metrics/MetricsUtil.java#L38-L40

view this post on Zulip Philip Durbin 🚀 (May 06 2024 at 14:12):

Yes, /monthly seems to be cumulative.
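
To illustrate what "cumulative" means here, the sketch below turns a running-total series into per-month increments. It is not from the thread. It works on plain `(month, count)` pairs rather than on the raw API response, because the exact response shape of `/monthly` is not shown above. Both function names are hypothetical.

```python
def is_non_decreasing(series):
    """True if counts never drop from one month to the next.

    This is consistent with a cumulative series but does not prove it.
    """
    counts = [count for _, count in series]
    return all(earlier <= later for earlier, later in zip(counts, counts[1:]))


def monthly_increments(series):
    """Convert (month, running_total) pairs into (month, new_in_month) pairs.

    The first month's increment equals its total, since we start from zero.
    """
    increments = []
    previous_total = 0
    for month, total in series:
        increments.append((month, total - previous_total))
        previous_total = total
    return increments
```

With the two values quoted in the thread, `monthly_increments([("2024-04", 72), ("2024-05", 73)])` gives 72 for April (everything up to then) and 1 for May, i.e. one dataset was added in May.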

view this post on Zulip Philip Durbin 🚀 (May 06 2024 at 14:17):

I was wondering if one of the 73 datasets returned by the Search API is deaccessioned (deaccessioned datasets are not included in metrics), but no, they are all released:

curl 'https://repo.researchdata.hu/api/search?q=*&type=dataset&per_page=100' | jq . | grep versionState
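
A rough Python equivalent of the curl/jq/grep pipeline above, which tallies the states instead of listing them. This is a sketch under assumptions: it expects the Search API response to carry its results in `data.items`, with a `versionState` field on each dataset item (the field the grep above looks for). The helper names and the `"UNKNOWN"` fallback label are hypothetical.

```python
import json
from collections import Counter
from urllib.parse import urlencode
from urllib.request import urlopen


def search_url(base_url, per_page=100):
    """Build the same dataset search query used in the curl command."""
    query = urlencode({"q": "*", "type": "dataset", "per_page": per_page})
    return f"{base_url}/api/search?{query}"


def version_state_counts(search_payload):
    """Tally versionState values across the returned items."""
    items = search_payload["data"]["items"]
    return Counter(item.get("versionState", "UNKNOWN") for item in items)


def fetch_version_state_counts(base_url):
    """Run the search and tally the states (needs network access)."""
    with urlopen(search_url(base_url), timeout=30) as response:
        return version_state_counts(json.load(response))
```

If every dataset is released, the result has a single key. Note that `per_page=100` only covers the first page, which is enough for 73 datasets but not for larger installations.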

view this post on Zulip Péter Pallinger (May 06 2024 at 14:20):

Well, I will stop investigating this now (I may look into the SQL queries run later), and use the dataLocation=all option.
The difference between local and all will remain a mystery for now.

view this post on Zulip Philip Durbin 🚀 (May 06 2024 at 14:31):

I asked in the IQSS Slack. If I hear anything, I'll let you know.

view this post on Zulip Philip Durbin 🚀 (May 06 2024 at 14:52):

@Péter Pallinger what if you clear the cache for the Metrics API? Any difference?

view this post on Zulip Péter Pallinger (May 06 2024 at 20:31):

I did not find out how to clear the cache, but setting MetricsCacheTimeoutMinutes=1 fixed it. No MetricsCacheTimeoutMinutes was set, and the default is probably quite long.
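
For reference, here is one way the setting could be changed from a script. This is a sketch, not necessarily what was run in the thread. It assumes the standard admin settings endpoint (`/api/admin/settings/:MetricsCacheTimeoutMinutes`) is reachable at `http://localhost:8080`, which is normally only the case on the server itself. The helper names are hypothetical. The Dataverse Installation Guide documents the default as 10080 minutes (7 days), which is worth checking against your version.

```python
from urllib.request import Request, urlopen

ADMIN_BASE = "http://localhost:8080"  # assumption: admin API reachable locally


def build_cache_timeout_request(minutes, admin_base=ADMIN_BASE):
    """Prepare (but do not send) a PUT that sets :MetricsCacheTimeoutMinutes."""
    url = f"{admin_base}/api/admin/settings/:MetricsCacheTimeoutMinutes"
    return Request(url, data=str(int(minutes)).encode("ascii"), method="PUT")


def set_cache_timeout(minutes, admin_base=ADMIN_BASE):
    """Send the request and return the HTTP status code (needs server access)."""
    with urlopen(build_cache_timeout_request(minutes, admin_base), timeout=30) as response:
        return response.status
```

`set_cache_timeout(1)` would mirror the value used above. A one-minute timeout makes the numbers fresh at the cost of recomputing metrics on almost every request, so a larger value may suit routine monitoring better.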

view this post on Zulip Notification Bot (May 06 2024 at 20:35):

Péter Pallinger has marked this topic as resolved.


Last updated: Oct 30 2025 at 06:21 UTC