deepeval
The LLM Evaluation Framework
DeepEval is a simple-to-use, open-source framework for evaluating large language model (LLM) systems. It is similar to Pytest but specialized for unit testing LLM apps. DeepEval incorporates the latest research to run evals via metrics such as G-Eval, task completion, answer relevancy, and hallucination, which use LLM-as-a-judge and other NLP models that run locally on your machine.
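The Pytest-style pattern described above (score an LLM output with a metric, then fail the test if the score falls below a threshold) can be sketched in plain Python. This is only an illustration of the idea and deliberately does not use DeepEval's API: `ToyTestCase`, `toy_relevancy`, and `assert_relevant` are hypothetical names, and the word-overlap score is a crude stand-in for an LLM-as-a-judge metric such as answer relevancy.

```python
from dataclasses import dataclass


@dataclass
class ToyTestCase:
    """Hypothetical container for one LLM interaction (not a DeepEval class)."""
    input: str
    actual_output: str


def _words(text: str) -> set[str]:
    """Lowercase the text and strip basic punctuation from each word."""
    cleaned = {w.strip("?.,!").lower() for w in text.split()}
    cleaned.discard("")
    return cleaned


def toy_relevancy(case: ToyTestCase) -> float:
    """Stand-in metric: fraction of input words that reappear in the output.

    A real evaluation framework would obtain this score from a judge model.
    """
    question_words = _words(case.input)
    if not question_words:
        return 0.0
    return len(question_words & _words(case.actual_output)) / len(question_words)


def assert_relevant(case: ToyTestCase, threshold: float = 0.5) -> None:
    """Fail like a unit test when the score falls below the threshold."""
    score = toy_relevancy(case)
    assert score >= threshold, f"relevancy {score:.2f} is below {threshold}"


def test_refund_answer() -> None:
    # A test runner such as Pytest collects this function by its test_ prefix.
    case = ToyTestCase(
        input="What is the refund policy?",
        actual_output="The refund policy is 30 days.",
    )
    assert_relevant(case, threshold=0.5)
```

Saved in a file named, for example, `test_example.py` (a hypothetical name), the function above would be picked up by `pytest` like any other unit test. Swapping the toy scoring function for a judge-model call is what turns this pattern into an LLM evaluation.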
Summary
The LLM Evaluation Framework
Last Updated
May 6, 2026 at 15:09
License
Apache-2.0
Supported Platforms
GitHub Repository
https://github.com/confident-ai/deepeval
Documentation
https://deepeval.com