Here's a potentially stupid idea: how about we look into using zstd compression for downloads and uploads? With many small files, downloads can be very slow because of the per-request overhead of HTTP. Wouldn't it be nice to send files over the wire as streamable packages? Example: a dataset contains a huge number of small files. To download them all as a ZIP today, the server first has to gather all of the files, build the ZIP, and then send it over the wire. Wouldn't it be nicer to use a compressed stream here?
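Rough sketch of what I mean (not a real implementation, just the idea): stream the files as a zstd-compressed tar so nothing has to be staged on the server first. This assumes the `zstandard` PyPI package; the directory name, output file name, and function name are made up for illustration.

```python
# Minimal sketch: stream many small files as a zstd-compressed tar,
# compressing on the fly instead of building an archive first.
import tarfile
from pathlib import Path

import zstandard as zstd  # pip install zstandard


def stream_dataset_as_tar_zst(file_paths, out_stream, level=3):
    """Write the given files into out_stream as a .tar.zst stream.

    Nothing is staged on disk: each file is tarred and compressed
    as it is read, so a client could start receiving bytes immediately.
    """
    cctx = zstd.ZstdCompressor(level=level)
    # closefd=False keeps out_stream open after the zstd frame is finished
    with cctx.stream_writer(out_stream, closefd=False) as zst_writer:
        # mode="w|" puts tarfile into non-seekable streaming mode
        with tarfile.open(fileobj=zst_writer, mode="w|") as tar:
            for path in file_paths:
                path = Path(path)
                tar.add(str(path), arcname=path.name)


if __name__ == "__main__":
    files = sorted(Path("dataset_files").glob("*"))  # hypothetical local copy of a dataset
    with open("dataset.tar.zst", "wb") as out:
        stream_dataset_as_tar_zst(files, out)
```

In an API handler the `out_stream` would be the HTTP response body instead of a local file, so compression cost is traded against never having to materialize the whole ZIP.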
I'm also thinking about the new Dataverse pyfilesystem and how it deals with accessing a dataset that has huge numbers of small files...
^^ @Jan Range
How well does S3 handle downloading/uploading loads of small files?
/me looks at https://en.wikipedia.org/wiki/Zstd and https://facebook.github.io/zstd/
I'm all for anything that speeds things up
I agree with Phil, the faster the better :grinning: