datasets
HuggingFace/Datasets is an open library of NLP datasets.
Datasets is a lightweight library providing one-line dataloaders for many public datasets and one-liners to download and pre-process any of the major public datasets provided on the HuggingFace Datasets Hub. Datasets are ready to use in a dataloader for training or evaluating an ML model (NumPy/Pandas/PyTorch/TensorFlow/JAX). Datasets also provides an API for simple, fast, and reproducible data pre-processing for the above public datasets as well as your own local datasets in CSV/JSON/text.
Summary
HuggingFace/Datasets is an open library of NLP datasets.
Information Last Updated
Aug 21, 2025 at 06:31
License
Apache-2.0
Total Downloads
773
Platforms
GitHub Repository
https://github.com/huggingface/datasets
Documentation
https://huggingface.co/docs/datasets/