Datasets is a lightweight library providing two main features:
- one-line dataloaders for many public datasets: one-liners to download and pre-process any of the major public datasets (text datasets in 467 languages and dialects, image datasets, audio datasets, etc.) provided on the HuggingFace Datasets Hub. With a simple command like `squad_dataset = load_dataset("squad")`, get any of these datasets ready to use in a dataloader for training or evaluating an ML model (Numpy/Pandas/PyTorch/TensorFlow/JAX),
- efficient data pre-processing: simple, fast and reproducible data pre-processing for the above public datasets as well as your own local datasets in CSV/JSON/text/PNG/JPEG/etc. With simple commands like `processed_dataset = dataset.map(process_example)`, efficiently prepare the dataset for inspection and for ML model evaluation and training.

Uploaded	Mon Mar 31 21:27:32 2025
md5 checksum	67fb607b3dbd6f3c353faa98eb97b486
arch	x86_64
build	py311h06a4308_0
depends	aiohttp, dill >=0.3.0,<0.3.7, fsspec >=2021.11.1, huggingface_hub >=0.11.0,<1.0.0, multiprocess, numpy >=1.17,<2.0a0, packaging, pandas, pyarrow >=8.0.0, python >=3.11,<3.12.0a0, python-xxhash, pyyaml >=5.1, requests >=2.19.0, responses <0.19, tqdm >=4.62.1
license	Apache-2.0
license_family	Apache
md5	67fb607b3dbd6f3c353faa98eb97b486
name	datasets
platform	linux
sha1	e96a08c4b1e7520e7289d043f92543352eed52e2
sha256	e54cec96b0179e8bb617a91140980b644ed3cfbdc6328114b3559e278aad355b
size	880849
subdir	linux-64
timestamp	1684482967044
version	2.12.0

linux-64/datasets-2.12.0-py311h06a4308_0.conda
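
The md5, sha1 and sha256 values listed above can be used to check a downloaded copy of this package file. The following is a sketch using only the Python standard library; the helper names `file_digests` and `matches_expected` are hypothetical, and the local path you pass in is an assumption about where the file was saved.

```python
import hashlib

# Expected digests copied from the metadata listing above.
EXPECTED = {
    "md5": "67fb607b3dbd6f3c353faa98eb97b486",
    "sha1": "e96a08c4b1e7520e7289d043f92543352eed52e2",
    "sha256": "e54cec96b0179e8bb617a91140980b644ed3cfbdc6328114b3559e278aad355b",
}


def file_digests(path, algorithms=("md5", "sha1", "sha256"), chunk_size=1 << 20):
    """Return {algorithm: hex digest} for the file, reading it in chunks."""
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


def matches_expected(path, expected=EXPECTED):
    """Return True only if every expected digest matches the file."""
    actual = file_digests(path, tuple(expected))
    return all(actual[name] == value.lower() for name, value in expected.items())
```

For example, `matches_expected("datasets-2.12.0-py311h06a4308_0.conda")` would return `True` only for a file whose three digests all match the listing; a mismatch on any one of them suggests a corrupted or different file.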