🪣 We just shipped Storage Buckets: S3-like mutable storage, cheaper & faster. Git falls short for everything on the high-throughput side of AI (checkpoints, processed data, agent traces, logs, etc.). Buckets fixes that: fast writes, overwrites, directory sync 💨 All powered by Xet dedup, so successive checkpoints skip the bytes that already exist.
Hugging Face Launches Storage Buckets for High-Throughput ML Workflows
Hugging Face · Updated
Hugging Face shipped Storage Buckets — mutable, S3-like object storage on the Hub for checkpoints, agent traces, and processed datasets. Xet chunk-based deduplication means successive checkpoints skip bytes already stored, cutting bandwidth and transfer time.
Buckets are addressed as hf://buckets/username/bucket-name and are manageable via the hf CLI or the huggingface_hub library (since v1.5.0). Buckets run on Xet, Hugging Face's chunk-based storage backend: successive checkpoints with frozen model sections skip bytes already stored, cutting bandwidth and storage footprint. Enterprise billing is based on deduplicated storage, so shared chunks reduce costs. Hugging Face partners with AWS and GCP for pre-warming, which brings data close to compute before training runs.
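To make the deduplication idea concrete, here is a minimal sketch of chunk-level dedup between two checkpoints that share frozen sections. It is a toy illustration, not Xet's implementation: the fixed 4-byte chunk size, the SHA-256 hashing, and the names `bytes_to_upload` and `stored` are assumptions made for this example only.

```python
import hashlib

CHUNK_SIZE = 4  # tiny illustrative value (assumption); real backends use far larger chunks


def bytes_to_upload(data: bytes, stored: set[str], chunk_size: int = CHUNK_SIZE) -> int:
    """Count bytes in chunks not yet in `stored`, and record those chunks as stored."""
    new_bytes = 0
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in stored:
            # Unseen chunk: it would have to be transferred and stored.
            stored.add(digest)
            new_bytes += len(chunk)
    return new_bytes


stored: set[str] = set()            # hashes of chunks already in the (hypothetical) store
frozen = b"AAAABBBBCCCC"            # stands in for frozen model sections
ckpt1 = frozen + b"DDDD"            # first checkpoint
ckpt2 = frozen + b"EEEE"            # next checkpoint: only the tail changed

first = bytes_to_upload(ckpt1, stored)
second = bytes_to_upload(ckpt2, stored)
print(first, second)
```

Under these assumptions the second checkpoint only pays for the chunk that changed, which is the effect the article attributes to Xet for successive checkpoints.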
Buckets are included in existing Hub storage plans — free accounts get storage to start; PRO and Enterprise offer higher limits. Sync a checkpoint directory with hf buckets sync or access bucket contents from any fsspec-compatible library via hf:// paths.
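As a rough sketch of the fsspec route, the snippet below builds a bucket path in the hf://buckets/username/bucket-name form and reads one object through `fsspec.open`. The helper names `bucket_uri` and `read_bucket_bytes` and the file name `checkpoints/step-100.bin` are hypothetical, and the read assumes that fsspec and a huggingface_hub version with bucket support are installed and authenticated.

```python
def bucket_uri(username: str, bucket: str, path: str = "") -> str:
    """Build an hf:// path for a bucket, optionally pointing at an object inside it."""
    base = f"hf://buckets/{username}/{bucket}"
    return f"{base}/{path.lstrip('/')}" if path else base


def read_bucket_bytes(uri: str) -> bytes:
    """Read one object via fsspec (assumes bucket paths resolve in the installed version)."""
    import fsspec  # third-party; huggingface_hub provides the hf:// filesystem

    with fsspec.open(uri, "rb") as f:
        return f.read()


# Hypothetical object name, for illustration only.
uri = bucket_uri("username", "bucket-name", "checkpoints/step-100.bin")
print(uri)
# data = read_bucket_bytes(uri)  # needs network access and credentials
```

Keeping path construction separate from the read keeps the network call optional, so the path logic can be checked offline.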