Notice: This resource is provided by a third-party author. Please review the code with AI tools or manually before use to ensure security and compatibility.
Python | huggingface/lighteval

lighteval

Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends

83.6/100
2.4K | Forks: 480

Similar Projects

deepeval

87

The LLM Evaluation Framework

Python | 16.1K

mlflow

91

The open source AI engineering platform for agents, LLMs, and ML models. MLflow enables teams of all sizes to debug, evaluate, monitor, and optimize production-quality AI applications while controlling costs and managing access to models and data.

Python | 26.4K

ragas

68

Supercharge Your LLM Application Evaluations 🚀

Python | 14.3K

oumi

91

Easily fine-tune, evaluate and deploy Gemma 4, Qwen3.5, Qwen3.6, gpt-oss, DeepSeek-R1, or any open source LLM / VLM!

Python | 9.3K