Hi there,
Does anyone know whether we had a good reason, back in 2014, to strip characters from header/variable names for XLSX?
varName = varName.replaceAll("[ _\t\n\r]", "");
https://github.com/IQSS/dataverse/blob/develop/src/main/java/edu/harvard/iq/dataverse/ingest/tabulardata/impl/plugins/xlsx/XLSXFileReader.java#L460
That doesn't seem to be the case for CSV.
Our users don't like that header names of two or three words end up concatenated without spaces.
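To make the effect concrete, here is a minimal sketch that applies the same regex to a couple of made-up header names. The class name, the `clean` helper, and the sample headers are hypothetical and only for illustration; only the `replaceAll` call is taken from the line quoted above.

```java
class HeaderCleanDemo {
    // Same regex as the quoted line: removes spaces, underscores,
    // tabs, newlines, and carriage returns from the name.
    static String clean(String varName) {
        return varName.replaceAll("[ _\t\n\r]", "");
    }

    public static void main(String[] args) {
        // Hypothetical headers, not taken from a real file.
        System.out.println(clean("Blood Pressure Reading")); // BloodPressureReading
        System.out.println(clean("patient_id"));             // patientid
    }
}
```

Note that underscores are removed too, so a header like `patient_id` also loses its separator.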
Maybe there was a good reason at the time, and maybe it is still valid today: database content, previewer issues, XLSX limitations...
Should we change that?
Huh. Not sure. I clicked "blame". The line dates from 11 years ago: https://github.com/IQSS/dataverse/commit/86c50ae470543b128771a7b742598428da170052
I guess we could ask Leonid about it, but he's off this week.
Last updated: Nov 01 2025 at 14:11 UTC