Stream: troubleshooting

Topic: Error while ingesting records


Thomas van Erven (Sep 11 2023 at 14:16):

I've set up an experimental stack on ODISSEI based on the development branch with our changes (xhtml files only). However, when we try to ingest via the Native API, I'm noticing this error: Failed to export the dataset as ddi. If you want, I can provide a complete dump from the logs. How can I best troubleshoot this?

Thomas van Erven (Sep 11 2023 at 14:17):

Specifically: Finalization: exception caught while exporting: Could not get prerequisite Optional[ddi] to create htmlexport for dataset 13 io.gdcc.spi.export.ExportException: Could not get prerequisite Optional[ddi] to create htmlexport for dataset 13

Philip Durbin 🚀 (Sep 11 2023 at 14:24):

Weird. Is it OK to post the file you're trying to ingest here? (Or is it too big or restricted?)

Thomas van Erven (Sep 11 2023 at 14:26):

Totally fine, lemme ask the dev that's producing the file. I'll get one from the LISS metadata since that's open source (CBS is kinda restricted).

Philip Durbin 🚀 (Sep 11 2023 at 14:29):

Great. Thanks. Also, did you branch before or after we released Dataverse 6.0 on Friday? Maybe you could just tell us the last commit from the main repo. Not your xhtml changes, I mean.

Thomas van Erven (Sep 11 2023 at 14:31):

This is the last commit I have:

commit f8bf5cb7e56150e6478b177085d4cf29d5142ca1
Merge: 2246d66601 6f7f814ff0
Author: landreev <leonid@hmdc.harvard.edu>

Thomas van Erven (Sep 11 2023 at 14:35):

Looks like I don't, looking at commits. Shall I update?

Thomas van Erven (Sep 11 2023 at 14:36):

(In fact; I might just pin on the 6.0 tag)

Philip Durbin 🚀 (Sep 11 2023 at 14:36):

https://github.com/IQSS/dataverse/commit/f8bf5cb7e56150e6478b177085d4cf29d5142ca1 great, thanks. I have a better sense of where you are. :big_smile: No need to update.

Philip Durbin 🚀 (Sep 11 2023 at 14:44):

You're almost on 6.0.

Thomas van Erven (Sep 11 2023 at 15:16):

If you want, I can pin there in case you want usable feedback specifically on that version : )

Philip Durbin 🚀 (Sep 11 2023 at 15:18):

No, no, if anything I'd probably ask you to pin to alpha/master.

Thomas van Erven (Sep 11 2023 at 15:35):

I pinned to development for now, with all due side effects of that (but can pin to master if you'd prefer). Open to structurally stable suggestions : )

Philip Durbin 🚀 (Sep 11 2023 at 15:37):

The develop branch is fine but previously you were an alpha/master guy. :big_smile:

Thomas van Erven (Sep 11 2023 at 15:46):

Oh, I'm running the alpha image. That hasn't changed.

Thomas van Erven (Sep 11 2023 at 15:47):

It's like making beer; you need a stable base to experiment from.

Thomas van Erven (Sep 11 2023 at 15:47):

Currently that base is open to change ; ) Everything is, until it gets users.

Oliver Bertuch (Sep 11 2023 at 16:14):

Are we sure this is container related? It sounds a bit like a general issue. Could this be tested with demo.dataverse.org? Should we move this to #dev ?

Thomas van Erven (Sep 11 2023 at 16:47):

Actually, I suddenly have an idea. I'll tell our dev first.

Thomas van Erven (Sep 11 2023 at 16:48):

Small explanation: I noticed collisions in field names between the core tsv files and ours, so I renamed some of them and reset everything to work with the new files. It could be breaking there.
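
A field-name collision of the kind described here can be looked for mechanically. The following is a minimal sketch, assuming the usual metadata block TSV layout in which a `#datasetField` header row names its columns and the data rows below it start with an empty first column. The function names and the block labels in the usage note are hypothetical, not part of Dataverse.

```python
from collections import defaultdict


def dataset_field_names(tsv_text):
    """Return the field names declared in the #datasetField section of one block TSV."""
    names = []
    in_fields = False
    name_col = None
    for line in tsv_text.splitlines():
        row = line.split("\t")
        first = row[0].strip()
        if first.startswith("#"):
            # A section header row; only the #datasetField section is of interest.
            in_fields = first == "#datasetField"
            if in_fields:
                name_col = [c.strip() for c in row].index("name")
            continue
        if in_fields and len(row) > name_col and row[name_col].strip():
            names.append(row[name_col].strip())
    return names


def find_collisions(blocks):
    """Map each field name declared by more than one block to the blocks declaring it.

    `blocks` maps a label of your choosing (hypothetical) to the TSV text of that block.
    """
    owners = defaultdict(set)
    for label, tsv_text in blocks.items():
        for name in dataset_field_names(tsv_text):
            owners[name].add(label)
    return {name: sorted(labels) for name, labels in owners.items() if len(labels) > 1}
```

Calling `find_collisions({"citation": core_text, "custom": custom_text})` with the text of two TSV files returns an empty dict when no name is shared. Whether a collision is what causes the DDI export error is not established in this thread.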

Thomas van Erven (Sep 11 2023 at 16:59):

The dev in question uses the Native API. It's quite possible that it only validates the schema format, and that the error surfaces later in the chain, when it tries to produce a DDI file on the assumption that the nested key/value pairs match what the schema demands when in fact they differ. I'd still expect that to be validated earlier, but it's a bit of an edge case.
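
To rule out malformed nesting before a record ever reaches the Native API, a client-side shape check is one option. This is a rough sketch, assuming the dataset JSON layout used by the Native API (`datasetVersion.metadataBlocks.<block>.fields`, each field carrying `typeName`, `typeClass`, `multiple` and `value`). The helper names are hypothetical, and the check only looks at structure: it does not know which field names your installation actually defines.

```python
REQUIRED_KEYS = ("typeName", "typeClass", "multiple", "value")


def check_field(field, path):
    """Return a list of structural problems found in one field (empty if none)."""
    if not isinstance(field, dict):
        return [f"{path}: expected an object"]
    missing = [k for k in REQUIRED_KEYS if k not in field]
    if missing:
        return [f"{path}: missing {', '.join(missing)}"]
    value, multiple = field["value"], field["multiple"]
    # A 'multiple' field should carry a list; a single-valued field should not.
    if multiple != isinstance(value, list):
        return [f"{path}: 'multiple' is {multiple} but value is {type(value).__name__}"]
    problems = []
    if field["typeClass"] == "compound":
        items = value if multiple else [value]
        for i, item in enumerate(items):
            if not isinstance(item, dict):
                problems.append(f"{path}[{i}]: compound value must be an object")
                continue
            # Each key of a compound value should be a child field of the same name.
            for key, child in item.items():
                child_path = f"{path}[{i}].{key}"
                problems.extend(check_field(child, child_path))
                if isinstance(child, dict) and child.get("typeName") != key:
                    problems.append(f"{child_path}: key does not match typeName")
    return problems


def check_dataset(doc):
    """Run check_field over every field of every metadata block in the document."""
    problems = []
    blocks = doc.get("datasetVersion", {}).get("metadataBlocks", {})
    for block_name, block in blocks.items():
        for field in block.get("fields", []):
            name = field.get("typeName", "?") if isinstance(field, dict) else "?"
            problems.extend(check_field(field, f"{block_name}.{name}"))
    return problems
```

An empty list from `check_dataset(json.load(f))` means the nesting looks consistent. It does not prove the record will export cleanly.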

Notification Bot (Sep 11 2023 at 19:34):

This topic was moved here from #containers > Error while ingesting records by Philip Durbin.

Philip Durbin 🚀 (Sep 11 2023 at 19:35):

Sounds plausible.

Philip Durbin 🚀 (Sep 11 2023 at 19:35):

I did go ahead and move this to troubleshooting.

Thomas van Erven (Sep 12 2023 at 07:51):

Good one. We're trying with a dataset that doesn't hit that specific metadata block. Easy way to check.

Thomas van Erven (Sep 12 2023 at 08:20):

Still the same problem, so much for eleventh hour Ballmer peak insight. I've gotten a record which we've tried to ingest using the Native API, attached here.
dataverse_metadata_ingest_example.json

Philip Durbin 🚀 (Sep 12 2023 at 11:13):

I see a bunch of DANS-specific metadata: dansRightsHolder, dansPersonalDataPresent, dansAudience, dansCollection, dansTemporalCoverage, dansDataversePid, dansDataversePidVersion, dansBagId, dansNbn

Philip Durbin 🚀 (Sep 12 2023 at 11:14):

But I'm not sure why that would affect DDI export.

Thomas van Erven (Sep 12 2023 at 11:32):

Yep, that's correct; we juggle a fair number of metadata blocks.

Philip Durbin 🚀 (Sep 12 2023 at 12:00):

If you remove the DANS metadata blocks from the JSON, do you still get the DDI error?
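
For anyone repeating this experiment, the custom blocks can be stripped from the JSON programmatically rather than by hand. A minimal sketch, assuming the record follows the Native API dataset JSON layout and that `citation` is the only block to keep; the function name and the default allow-list are assumptions for illustration.

```python
import copy


def keep_blocks(doc, keep=("citation",)):
    """Return a copy of the dataset JSON containing only the named metadata blocks."""
    trimmed = copy.deepcopy(doc)  # leave the caller's document untouched
    blocks = trimmed.get("datasetVersion", {}).get("metadataBlocks", {})
    for name in list(blocks):
        if name not in keep:
            del blocks[name]
    return trimmed
```

Writing the result back out with `json.dump` gives a record that can be ingested to see whether the DDI error persists without the custom blocks.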

Thomas van Erven (Sep 14 2023 at 12:27):

Sorry for the delay here (had a mix of vacation days to deal with); yeah, we can confirm this is still the case. I just reset the whole stack, removed the metadata blocks by tossing the DB volume, and tried a vanilla upload. However, I note that:

Philip Durbin 🚀 (Sep 14 2023 at 13:05):

DDI export relies on the hostname?

Thomas van Erven (Sep 14 2023 at 13:08):

It calls the API, so practically yes. If I rewrite the correct path manually, it simply "fails" without further comment.

Thomas van Erven (Sep 14 2023 at 13:16):

(I'm assuming DDI files are statically generated; if so, it would make sense for this to fail when the file couldn't be generated earlier.)
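
One way to test the export path by hand is to request the export directly under the hostname the installation is configured with. The sketch below assumes the `/api/datasets/export` endpoint from the Dataverse Native API guide, which takes `exporter` and `persistentId` query parameters. The base URL and persistent ID in the usage note are placeholders, and the function names are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def export_url(base_url, persistent_id, exporter="ddi"):
    """Build the metadata export URL for one dataset."""
    query = urlencode({"exporter": exporter, "persistentId": persistent_id})
    return f"{base_url.rstrip('/')}/api/datasets/export?{query}"


def fetch_export(base_url, persistent_id, exporter="ddi", timeout=30):
    """Fetch the export and return (HTTP status, body text).

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses, so a failed
    export surfaces as an exception rather than as a return value.
    """
    with urlopen(export_url(base_url, persistent_id, exporter), timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8", errors="replace")
```

For example, `fetch_export("http://localhost:8080", "doi:10.5072/FK2/EXAMPLE")` would request the DDI export from a local instance. Comparing the result under `localhost` with the result under the public hostname may show whether the hostname matters, though the thread does not confirm the root cause.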

Philip Durbin 🚀 (Sep 14 2023 at 13:21):

Would you be able to open an issue about this?

Thomas van Erven (Sep 22 2023 at 08:52):

Yes, certainly. I can provide an env for this as well, plus credentials.

Thomas van Erven (Oct 12 2023 at 11:58):

I forgot to update this topic, but it works (I think). At least it does manually.

Thomas van Erven (Oct 12 2023 at 11:58):

Can be resolved.

Philip Durbin 🚀 (Oct 12 2023 at 12:24):

Great!

Philip Durbin 🚀 (Oct 12 2023 at 12:24):

Is there anything we should document? If so, please feel free to create an issue.


Last updated: Oct 30 2025 at 05:14 UTC