Hi, I was just playing around with the Dataverse Search API and noticed that almost all fields are indexed in Solr as text_en, which rules out e.g. range searches on integer and date fields. I found this relatively old (and now closed) issue from 2014 about using better-fitting index data types: https://github.com/IQSS/dataverse/issues/370 It seems that because the date range facet was never implemented, the index types haven't been touched yet (?) Is there an open issue about this at the moment? Or any ongoing work? :)
Yes, we should do this and no, there hasn't been any recent discussion.
@Vera Clemens are you mostly interested in date ranges?
:+1: I'm interested in both date ranges and integer ranges
Cool. Out of curiosity, are the integers in one of your custom metadata blocks? Or one of the standard blocks we ship with Dataverse?
They're in custom metadata blocks.
Ok. I'm wondering if we should also fix anything in the standard blocks. There are dates everywhere. And I believe some of the astro fields use integers.
That way, when we make a release, there are some nice examples people can see without installing custom metadata blocks.
Yes, sure! I think this could benefit integer and date fields in any metadata block.
Even without implementing any new facet with sliders, I think if the fields were indexed as integers/dates, you could run Solr range queries via the search input field (or the search API ?q=...) like integerfield:[25 TO 50] or datefield:[2000-11-01 TO 2014-12-01].
Yes, definitely. Please see https://guides.dataverse.org/en/6.3/api/search.html#date-range-search-example
And fileSizeInBytes:[32212254720 TO *] at https://github.com/IQSS/dataverse/issues/4439#issuecomment-468685228
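As a rough illustration of the range syntax mentioned above, here is a minimal Python sketch that builds such a clause and puts it into a Search API `?q=` URL. The helper names `range_query` and `search_url` and the base URL are hypothetical; `integerfield` is the placeholder from the message above, and the query only returns sensible results if the field is indexed with a numeric or date type.

```python
from urllib.parse import urlencode


def range_query(field, low=None, high=None):
    """Build a Solr range clause like field:[low TO high].

    None on either side means an open-ended bound, written as * in Solr.
    """
    lo = "*" if low is None else str(low)
    hi = "*" if high is None else str(high)
    return f"{field}:[{lo} TO {hi}]"


def search_url(base_url, query):
    """Build a Search API URL with the query URL-encoded into ?q=..."""
    return f"{base_url.rstrip('/')}/api/search?{urlencode({'q': query})}"


# Hypothetical installation URL, used for illustration only.
BASE_URL = "https://dataverse.example.org"

print(search_url(BASE_URL, range_query("integerfield", 25, 50)))
print(search_url(BASE_URL, range_query("fileSizeInBytes", 32212254720)))
```

The second call leaves the upper bound open, which matches the `fileSizeInBytes:[32212254720 TO *]` example.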
I've been playing around with trying to index fields as something other than text_en and it seems to be working OK.
You can try it out here: http://solr-fieldtypes-test-dataverse.qa.km.k8s.zbmed.de/ (note, this is on our dev cluster that is offline during the night, so if you're in the US, it might be offline if you are checking after 2pm-ish, sorry about that)
I've added an experimental metadata block containing an integer, a float and a date field and indexed them in Solr as plong, pdouble and date_range (by just manually editing the schema.xml) and created 3 test datasets with the following values:
Here are some sample queries based on these fields:
(Filtering the date field still has some issues and needs some more testing)
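For anyone who hasn't edited schema.xml by hand, entries of the kind described above (an integer, a float and a date field typed as plong, pdouble and date_range) might be generated roughly like this. The field names `exampleInteger`, `exampleFloat` and `exampleDate` are hypothetical, and the attribute set is an assumption modeled on typical Solr `<field>` declarations, not the contents of the experimental block.

```python
import xml.etree.ElementTree as ET


def solr_field(name, solr_type, multi_valued=False):
    """Render one schema.xml <field> element as a string.

    The stored/indexed/multiValued attributes are assumed defaults for
    this sketch, not values taken from the actual test setup.
    """
    element = ET.Element("field", {
        "name": name,
        "type": solr_type,
        "stored": "true",
        "indexed": "true",
        "multiValued": "true" if multi_valued else "false",
    })
    return ET.tostring(element, encoding="unicode")


# Hypothetical field names paired with the Solr types named in the thread.
for field_name, field_type in [
    ("exampleInteger", "plong"),
    ("exampleFloat", "pdouble"),
    ("exampleDate", "date_range"),
]:
    print(solr_field(field_name, field_type))
```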
Great! I'm glad the testing is going well!
Found the issue with the date field. The date fields were getting indexed as years only, without months or days (if present). It seems this is intentional: https://github.com/IQSS/dataverse/blob/050064ef264c667c2473c78b893def832c33f992/src/main/java/edu/harvard/iq/dataverse/search/IndexServiceBean.java#L1070 Do you see any issue with changing this? I assume there might be an issue if the date field is set to be facetable? That's something I haven't tested yet.
Yes, probably we did that to have a reasonable (string-based) facet of four digits. But if we had a proper range facet... :grinning:
Would you want to add the UI for this in JSF? I always hesitate with this since it will be replaced by the new frontend. Would you rather see it added there instead?
Could we index the full date in "<dateFieldName>", but only the year in "<dateFieldName>_s"? From a quick test, that seems to allow proper range searches (using the [... TO ...] syntax) but keeps the facets as-is for now.
And yes, if we were to add a proper range facet to the UI, I would like to see it in the new frontend :)
Hmm, that might work.
Yep, seems to work
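The split discussed above (full date in "<dateFieldName>", year only in "<dateFieldName>_s") can be sketched in Python as follows. This is only an illustration of the idea, not the Java code in IndexServiceBean; the function name `date_index_entries`, the field name `exampleDate` and the accepted input formats (YYYY, YYYY-MM, YYYY-MM-DD) are assumptions.

```python
import re

# Accepts YYYY, YYYY-MM or YYYY-MM-DD (an assumption for this sketch).
DATE_PATTERN = re.compile(r"^(\d{4})(-\d{2}){0,2}$")


def date_index_entries(field_name, value):
    """Return the two index entries for one date value.

    The full value goes into <field_name> so range searches can use
    months and days; only the year goes into <field_name>_s so the
    existing four-digit facet stays unchanged.
    """
    match = DATE_PATTERN.match(value)
    if match is None:
        raise ValueError(f"unsupported date format: {value!r}")
    return {field_name: value, f"{field_name}_s": match.group(1)}


print(date_index_entries("exampleDate", "2014-12-01"))
print(date_index_entries("exampleDate", "2000"))
```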
I've opened a PR with my changes here: https://github.com/IQSS/dataverse/pull/10887
@Vera Clemens looks great! Should the title of the PR have something about highlighting in it?
Also, can you please add a release note snippet?
Both done :smile:
Looks great. Amazing release note. Do you think it's worth it to add an API test?
Thanks!
Hm, yes maybe. I'll try and take a look on Monday!
Wonderful! Thanks!
Last updated: Nov 01 2025 at 14:11 UTC