Notice: This resource is provided by a third-party author. Please review the code with AI tools or manually before use to ensure security and compatibility.
huggingface/lighteval (Python)

lighteval

Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends

Score: 81.9/100
Stars: 2.4K · Forks: 452
View on GitHub · Homepage →

Similar Projects

deepeval

Score: 87

The LLM Evaluation Framework

Python · 14.9K stars

mlflow

Score: 91

The open source AI engineering platform for agents, LLMs, and ML models. MLflow enables teams of all sizes to debug, evaluate, monitor, and optimize production-quality AI applications while controlling costs and managing access to models and data.

Python · 25.5K stars

ragas

Score: 73

Supercharge Your LLM Application Evaluations 🚀

Python · 13.6K stars

oumi

Score: 89

Easily fine-tune, evaluate, and deploy gpt-oss, Qwen3, DeepSeek-R1, or any open-source LLM / VLM!

Python · 9.2K stars