Overview

DataverseFS exposes the files of a single Dataverse dataset version as an fsspec filesystem. Once you have an instance, you can browse, read, write, and delete files using the standard fsspec interface (ls, info, open, cat, rm, find, glob, …) plus a few Dataverse-specific helpers for tabular data.

It is the same machinery that powers the high-level Dataset file operations — dataset.open(...), dataset.files, dataset.upload_file(...) all delegate to a DataverseFS under the hood — but you can also use it directly whenever you want a filesystem-style view of a dataset.
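
Because DataverseFS follows the standard fsspec interface, the calls listed above behave as they do on any other fsspec filesystem. The sketch below uses fsspec's built-in in-memory filesystem as a stand-in so that it runs without a Dataverse server. The `/demo/...` paths and the file contents are made up for illustration, and nothing here exercises DataverseFS itself.

```python
import posixpath

import fsspec

# Stand-in for a DataverseFS instance: fsspec's built-in in-memory filesystem.
fs = fsspec.filesystem("memory")

# Create two small files (hypothetical paths and contents).
fs.pipe("/demo/data/notes.txt", b"hello dataverse\n")
fs.pipe("/demo/data/table.csv", b"a,b\n1,2\n")

# ls / glob: enumerate files. Basenames are compared so the example does not
# depend on how a particular filesystem formats its full paths.
listed = sorted(posixpath.basename(p) for p in fs.ls("/demo/data", detail=False))
csv_files = [posixpath.basename(p) for p in fs.glob("/demo/data/*.csv")]

# info: per-file metadata as a dict; "size" is in bytes.
size = fs.info("/demo/data/notes.txt")["size"]

# cat: read a whole file. open + seek: read only a slice of it.
whole = fs.cat("/demo/data/notes.txt")
with fs.open("/demo/data/notes.txt", "rb") as f:
    f.seek(6)
    tail = f.read(9)

# rm: delete a file, then confirm it is gone.
fs.rm("/demo/data/table.csv")
still_there = fs.exists("/demo/data/table.csv")
```

With a real DataverseFS instance the same method names apply; only the construction of `fs` differs.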

Why a filesystem?

Treating a dataset as a filesystem unlocks two things that are otherwise awkward with a plain REST client:

Streaming, not buffering. Reads are served by HTTP Range requests, so seeking or reading a slice never downloads the whole file. Writes are streamed to Dataverse in bounded chunks, so uploading a large file never holds it all in memory.
The fsspec ecosystem. Importing pyDataverse registers a dataverse:// URL protocol with fsspec. Any fsspec-aware library — pandas, Dask, Polars, PyArrow, Zarr — can then read a dataset file directly from a URL, with no glue code. This is what makes pd.read_csv("dataverse://...") work.
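
The two streaming ideas above can be illustrated without any Dataverse server. The sketch below is not pyDataverse's implementation: `range_header` and `iter_chunks` are hypothetical helper names, and the 4-byte chunk size is chosen only to keep the demo small. It shows how a slice of a file maps to an HTTP Range header (byte ranges are inclusive at both ends) and how a writer can consume a source in bounded chunks.

```python
import io
from typing import BinaryIO, Dict, Iterator


def range_header(offset: int, length: int) -> Dict[str, str]:
    """Build the HTTP Range header for `length` bytes starting at `offset`."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    # HTTP byte ranges are inclusive, so the last byte is offset + length - 1.
    return {"Range": f"bytes={offset}-{offset + length - 1}"}


def iter_chunks(source: BinaryIO, chunk_size: int) -> Iterator[bytes]:
    """Yield successive blocks of at most `chunk_size` bytes from `source`."""
    while True:
        block = source.read(chunk_size)
        if not block:
            return
        yield block


# Reading 100 bytes starting at byte 1000 asks the server for bytes 1000..1099.
header = range_header(1000, 100)

# Only one chunk is held in memory at a time, regardless of the source size.
chunks = list(iter_chunks(io.BytesIO(b"0123456789"), chunk_size=4))
```

A server that honors the Range header returns only the requested bytes, which is why a seek followed by a short read stays cheap even on a very large file.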

Quickstart

from pyDataverse.filesystem import DataverseFS

fs = DataverseFS(
    base_url="https://demo.dataverse.org",
    identifier="doi:10.5072/FK2/ABCDEF",
)

fs.ls("/")                                  # list files
with fs.open("data/notes.txt", "r") as f:   # stream a file
    print(f.read())

In this section

Connecting: Create a DataverseFS directly, from a URL, or from a high-level Dataset.

Browsing & metadata: List, glob, and inspect files; read rich Dataverse metadata.

Reading files: Stream file content in text or binary, including byte-range reads.

Writing files: Create, replace, and delete files; attach metadata on upload.

Tabular data: Load ingested tabular files straight into pandas DataFrames.

pandas & the fsspec ecosystem: Read datasets by URL from pandas, Dask, Polars, and the command line.