Stream: python

Topic: setting a new description with EasyDataverse


Philip Durbin 🚀 (Jan 14 2025 at 20:45):

I know I can add a description to a dataset that doesn't have one like this:

dataset.citation.add_ds_description(value="This is a description of the dataset")

But if I've loaded an existing dataset (dataset = dataverse.load_dataset(pid)) how do I update the description? :thinking:

Jan Range (Jan 15 2025 at 12:42):

You can edit the object directly, by using the following:

dataset.citation.ds_description.value = "New description"

The adder functions are mostly a convenience, so you don't have to dig out the expected type and then set it on the appropriate attribute. In general, you can apply any Python OOP techniques to metadata blocks.

Philip Durbin 🚀 (Jan 15 2025 at 12:46):

I thought I tried that! I will, as soon as I get to my desk. Is this documented? I couldn't find it anywhere. :sweat_smile:

Jan Range (Jan 15 2025 at 12:48):

It is kind of documented within this Jupyter notebook, but only for the primitive case and not the compound one. It should still work, as the Dataverse JSON is constructed from the given Dataset object.

Philip Durbin 🚀 (Jan 15 2025 at 12:52):

What about the fact that description is multi-valued? The code above would update the first description?

Jan Range (Jan 15 2025 at 12:54):

Oh you are right! In this case you'd need to add an index as the attribute is of type List. Here's an update:

dataset.citation.ds_description[i].value = "New description"
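
To make the indexing point concrete, here is a minimal sketch. The `DsDescription` and `Citation` classes and the `set_description` helper are hypothetical stand-ins written with plain dataclasses, not the classes EasyDataverse generates; they only mimic the shape discussed above (a citation block holding a list of descriptions).

```python
from dataclasses import dataclass, field
from typing import List, Optional


# Hypothetical stand-ins, NOT the EasyDataverse classes: they only mimic
# a citation block whose ds_description attribute is a list.
@dataclass
class DsDescription:
    value: Optional[str] = None


@dataclass
class Citation:
    ds_description: List[DsDescription] = field(default_factory=list)

    def add_ds_description(self, value: str) -> None:
        # Convenience adder: builds the compound entry and appends it.
        self.ds_description.append(DsDescription(value=value))


def set_description(citation: Citation, text: str, index: int = 0) -> None:
    """Overwrite the description at `index`, or add one if the list is empty."""
    if not citation.ds_description:
        citation.add_ds_description(value=text)
    else:
        citation.ds_description[index].value = text


citation = Citation()
citation.add_ds_description(value="First description")
citation.add_ds_description(value="Second description")

set_description(citation, "New description")           # updates entry 0
set_description(citation, "Another new one", index=1)  # updates entry 1
```

The empty-list branch is there because indexing `ds_description[0]` on a dataset that has no description yet would raise an `IndexError`.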

Jan Range (Jan 15 2025 at 12:56):

On another note, if the Citation block is consistent across versions and instances, we could add convenience functions for commonly used fields such as description and title

Philip Durbin 🚀 (Jan 15 2025 at 13:04):

Ah ha. Do we need to add docs for this?

Philip Durbin 🚀 (Jan 15 2025 at 13:05):

Technically, a description can have a date. I don't want to set a date but off the top of your head, do you think this is possible?

Philip Durbin 🚀 (Jan 15 2025 at 16:52):

Ok, here's the final version that works:

# Update the metadata fields
dataset.citation.title = title
dataset.citation.subject = ["Other"]
dataset.metadatablocks["citation"].ds_description[0].value = description
dataset.metadatablocks["citation"].author[0].name = author_name
dataset.metadatablocks["citation"].author[0].affiliation = None
dataset.metadatablocks["citation"].dataset_contact[0].name = "Dataverse Support"
dataset.metadatablocks["citation"].dataset_contact[0].email = "support@dataverse.org"
dataset.metadatablocks["citation"].origin_of_sources = '<a href="' + access_link_url + '">' + access_link_name + '</a>'

Philip Durbin 🚀 (Jan 15 2025 at 16:53):

In short, you should use the dataset.metadatablocks trick for direct access to the part of the dict you need to change.

Philip Durbin 🚀 (Jan 15 2025 at 16:54):

Also this is super helpful for figuring out which values need to be set:

print(dataverse.list_metadatablocks(detailed=True))

Philip Durbin 🚀 (Jan 15 2025 at 16:55):

And you can print out the dataset as a dict:

print(dataset.dataverse_dict())

Or in JSON format (with optional indenting):

print(dataset.dataverse_json(4))
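
One way to confirm that an updated description actually landed is to pull it back out of that dict. The sketch below assumes the dict follows the Dataverse native API layout (`datasetVersion` > `metadataBlocks` > `citation` > `fields`); `description_values` is a hypothetical helper, and `sample` is hand-written data standing in for the output of `dataset.dataverse_dict()`.

```python
import json
from typing import Any, Dict, List


def description_values(dataset_dict: Dict[str, Any]) -> List[str]:
    """Collect every dsDescriptionValue from a native-API-style dataset dict."""
    blocks = dataset_dict.get("datasetVersion", {}).get("metadataBlocks", {})
    fields = blocks.get("citation", {}).get("fields", [])
    values = []
    for fld in fields:
        if fld.get("typeName") != "dsDescription":
            continue
        # dsDescription is a compound, multi-valued field: a list of entries.
        for entry in fld.get("value", []):
            inner = entry.get("dsDescriptionValue")
            if inner is not None:
                values.append(inner["value"])
    return values


# Hand-written sample in the native API layout (an assumption about what
# dataset.dataverse_dict() returns, not captured output).
sample = {
    "datasetVersion": {
        "metadataBlocks": {
            "citation": {
                "fields": [
                    {
                        "typeName": "dsDescription",
                        "multiple": True,
                        "typeClass": "compound",
                        "value": [
                            {
                                "dsDescriptionValue": {
                                    "typeName": "dsDescriptionValue",
                                    "multiple": False,
                                    "typeClass": "primitive",
                                    "value": "New description",
                                }
                            }
                        ],
                    }
                ]
            }
        }
    }
}

print(json.dumps(sample, indent=4))
print(description_values(sample))
```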

Philip Durbin 🚀 (Jan 15 2025 at 16:56):

Huge thanks to @Jan Range for the live troubleshooting this morning. This is a good reason to attend the pyDataverse meeting! (#python > meetings) :big_smile:


Last updated: Nov 01 2025 at 14:11 UTC