Stream: troubleshooting

Topic: Data upload question - how to add searchable fields


Nir Harel (Jan 14 2025 at 16:16):

Hello,
I hope I'm writing in the right channel.
I need to upload data to a Dataverse repository (Max Planck Edmond). I have a JSON file (which I can convert to CSV) of values that describe the experiment/file I'm uploading, and I want to be able to search the files I've uploaded based on those values.

I saw that if I upload a CSV it is translated into tabular data (variables?). Can I use the API to search using these values?

Or what is the right way to add fields that describe my data (if it is possible)?

Thanks
Nir Harel

Philip Durbin 🚀 (Jan 14 2025 at 16:27):

Yes, if the CSV is successfully ingested, the column names will be searchable. However, the values won't be.

To make data files searchable generally, you should turn on full text indexing. Here's how: https://guides.dataverse.org/en/6.5/installation/config.html#solrfulltextindexing
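For reference, the linked guide describes this as a database setting changed through the admin settings API, which only the installation's administrators can call. A minimal Python sketch, assuming an installation reachable at `http://localhost:8080`; the helper name `build_fulltext_toggle` is made up for illustration, and the code only builds the request without sending it:

```python
from urllib.request import Request


def build_fulltext_toggle(base_url: str = "http://localhost:8080",
                          enabled: bool = True) -> Request:
    """Build (but do not send) the PUT request for :SolrFullTextIndexing."""
    # The settings endpoint takes the new value as the raw request body.
    url = f"{base_url.rstrip('/')}/api/admin/settings/:SolrFullTextIndexing"
    body = b"true" if enabled else b"false"
    return Request(url, data=body, method="PUT")


req = build_fulltext_toggle()
print(req.get_method(), req.full_url)
```

Sending it would be `urllib.request.urlopen(req)` on a machine that is allowed to reach the admin API.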

Nir Harel (Jan 14 2025 at 16:32):

Is there no way to use an API to access/search the values of the fields, or to create similar custom searchable values?

Philip Durbin 🚀 (Jan 14 2025 at 16:34):

Well, searching and accessing are different. I was talking about searching. You want to access the JSON file? (Or CSV if you switch format?) You can certainly download any file you upload.

Philip Durbin 🚀 (Jan 14 2025 at 16:40):

As for describing a data file, you could upload an auxiliary file or two to describe it: https://guides.dataverse.org/en/6.5/developers/aux-file-support.html

Philip Durbin 🚀 (Jan 14 2025 at 16:42):

If you look at https://demo.dataverse.org/dataverse/demo/search and scroll down to "Files" you see that ordinarily, only a few fields are searchable for files: name, description, file type, etc. This is all metadata about files we keep in the database.

That "full text indexing" feature I mentioned is our main way of making the content of the file searchable (via the UI and API).
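On the API side, a file search goes through the Search API (`/api/search`) with `type=file`. A rough Python sketch, assuming the demo server mentioned above and a made-up search term; the helper names are hypothetical, and whether file contents match depends on full text indexing being enabled on that installation:

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_file_search_url(base_url: str, query: str, per_page: int = 10) -> str:
    """Build a Search API URL that returns only files."""
    params = {"q": query, "type": "file", "per_page": per_page}
    return f"{base_url.rstrip('/')}/api/search?{urlencode(params)}"


def file_names(response: dict) -> list:
    """Pull the file names out of a decoded Search API response."""
    return [item["name"] for item in response["data"]["items"]]


def search_files(base_url: str, query: str) -> list:
    """Run the search over the network (not called in this sketch)."""
    with urlopen(build_file_search_url(base_url, query)) as resp:
        return file_names(json.load(resp))


# "temperature" is an illustrative search term, not from the thread.
url = build_file_search_url("https://demo.dataverse.org", "temperature")
print(url)
```

Unpublished files usually also need an API token sent in the `X-Dataverse-key` header, which this sketch leaves out.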

Philip Durbin 🚀 (Jan 14 2025 at 16:43):

I hope this is helping! :sweat_smile: Please keep the questions coming.

Nir Harel (Jan 15 2025 at 10:54):

Thanks, it helps, but I still need to find a way to search by values that describe the files (I have about 20 values that describe each file). Can you think of an option to make that possible (from what I understand, the aux file content is not searchable), or is full text search my only option?

Thanks for the help

Philip Durbin 🚀 (Jan 15 2025 at 12:40):

Right, aux files are not searchable.

Philip Durbin 🚀 (Jan 15 2025 at 12:42):

Really, the only way to make the contents of the file searchable is through full text indexing.

A small exception to this is that the "variable name" (column name) and "variable label" (column description) of ingested tabular files are searchable without having full text indexing enabled.
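A search restricted to those two fields might look like the Python sketch below. It assumes the underlying index fields are named `variableName` and `variableLabel` (worth checking against the installation's advanced search page), uses the demo server and a made-up column name, and the helper names are hypothetical:

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def variable_query(term: str) -> str:
    """Match a term against column names or column labels (assumed field names)."""
    return f"variableName:{term} OR variableLabel:{term}"


def build_variable_search_url(base_url: str, term: str) -> str:
    """Build a Search API URL for files whose variables match the term."""
    params = {"q": variable_query(term), "type": "file"}
    return f"{base_url.rstrip('/')}/api/search?{urlencode(params)}"


# "temperature" stands in for one of the CSV column names.
url = build_variable_search_url("https://demo.dataverse.org", "temperature")

# Decode the query string again to confirm the fielded query survived encoding.
decoded_q = parse_qs(urlsplit(url).query)["q"][0]
print(decoded_q)
```

This only finds files by column name or label; the cell values themselves are still not searchable this way.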

Philip Durbin 🚀 (Jan 15 2025 at 12:43):

One thought: would you have any interest in adding some sample data to https://github.com/IQSS/dataverse-sample-data ? We use this repo for testing.


Last updated: Oct 30 2025 at 06:21 UTC