Evaluate is a library that makes evaluating and comparing models and reporting their performance easier and more standardized. It currently contains:

- implementations of dozens of popular metrics: the existing metrics cover a variety of tasks spanning NLP to Computer Vision, and include dataset-specific metrics for particular datasets. With a simple command like `accuracy = load("accuracy")`, any of these metrics is ready to use for evaluating an ML model in any framework (NumPy/Pandas/PyTorch/TensorFlow/JAX); see the sketch after this list.
- comparisons and measurements: comparisons are used to measure the difference between models, and measurements are tools to evaluate datasets.
- an easy way of adding new evaluation modules to the 🤗 Hub: you can create new evaluation modules and push them to a dedicated Space on the 🤗 Hub with `evaluate-cli create [metric name]`, which lets you easily compare different metrics and their outputs for the same sets of references and predictions.
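As a quick illustration of the `load` workflow mentioned above, the sketch below loads the bundled `accuracy` metric and computes it on a few toy predictions; it is a minimal example, and the exact output keys vary by metric.

```python
# Minimal sketch of the metric workflow, using the evaluate.load() /
# compute() API described above.
import evaluate

# Fetch the "accuracy" metric implementation by name.
accuracy = evaluate.load("accuracy")

# Predictions and references can be plain lists or arrays/tensors
# from NumPy, Pandas, PyTorch, TensorFlow, or JAX.
results = accuracy.compute(predictions=[0, 1, 1, 0], references=[0, 1, 0, 0])
print(results)  # {'accuracy': 0.75}
```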
| Field | Value |
|---|---|
| Uploaded | Mon Mar 31 21:33:59 2025 |
| arch | x86_64 |
| build | py311h06a4308_0 |
| depends | cookiecutter, datasets >=2.0.0, dill, fsspec >=2021.05.0, huggingface_hub >=0.7.0, multiprocess, numpy >=1.17,<2.0a0, packaging, pandas, python >=3.11,<3.12.0a0, python-xxhash, requests >=2.19.0, responses <0.19, tqdm >=4.62.1 |
| license | Apache-2.0 |
| license_family | Apache |
| md5 | c1ace84a2b0f3d599e6ad039abd91e5e |
| name | evaluate |
| platform | linux |
| sha1 | e2ad29b6dfe961fa875ccb7b153cc9ad2af2c026 |
| sha256 | 72b7e46e2d9da3f89b3279608b7475bd3f9b34be0a1950a7ddb227e7b9923ad9 |
| size | 156264 bytes |
| subdir | linux-64 |
| timestamp | 1679573745132 (Unix epoch, ms) |
| version | 0.4.0 |