Stream: troubleshooting

Topic: STATA file issue


Bethany Seeger (Nov 24 2025 at 21:15):

Hi,
On our test server there is a STATA file that failed to convert to tabular format during ingest. The information on the page shows this:

image.png

I'm doing some debugging and can't figure out what GSO stands for. I'm guessing, based on this message, that the file is somehow corrupt? I don't have access to STATA to read it, but I did write a quick Python script to open it, and that failed with this message:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x82 in position 16: invalid start byte

Any advice?
Thanks!

Philip Durbin 🚀 (Nov 24 2025 at 21:59):

Hmm, my first thought is to plug your file into a test at https://github.com/IQSS/dataverse/tree/v6.8/src/test/java/edu/harvard/iq/dataverse/ingest/tabulardata/impl/plugins/dta (but I'm not sure which one).

These are unit tests. So you don't have to run Payara, etc.
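
As a side note on the UnicodeDecodeError above: a .dta file is binary, so that error is what you would expect from opening it in text mode, and on its own it probably doesn't tell you whether the file is corrupt. A rough sketch for looking at the header in binary mode instead is below. The function names are made up for illustration, and the format details are assumptions to check against the Stata dta spec: newer files (release 117 and up) start with an XML-like `<stata_dta><header><release>NNN</release>` tag, and older files keep the release number in the first byte (roughly 102 to 115).

```python
import re
from typing import Optional

# Assumption: release 117+ files begin with this XML-like tag sequence.
XML_HEADER = re.compile(rb"<stata_dta><header><release>(\d{3})</release>")


def sniff_dta_release(head: bytes) -> Optional[int]:
    """Guess the dta release number from the first bytes of a file.

    Hypothetical helper, not part of Dataverse or any library.
    Returns None if the bytes don't look like a dta header.
    """
    match = XML_HEADER.match(head)
    if match:
        return int(match.group(1))
    # Assumption: legacy formats store the release number in byte 0.
    # This is a weak heuristic, since any file starting with such a
    # byte value will pass.
    if head and 102 <= head[0] <= 115:
        return head[0]
    return None


def sniff_file(path: str) -> Optional[int]:
    """Read the start of a file in binary mode and sniff its release."""
    with open(path, "rb") as f:
        return sniff_dta_release(f.read(64))
```

Calling `sniff_file("yourfile.dta")` (path is a placeholder) and getting `None` would suggest the header itself is damaged or the file isn't really a dta file. Getting a release number only says the header looks plausible; the problem could still be further into the file.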


Last updated: Jan 09 2026 at 14:18 UTC