⚠ Notice: This resource is provided by a third-party author. Please review the code with AI tools or manually before use to ensure security and compatibility.

Python · huggingface/lighteval

lighteval

Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends

81.9/100

★ 2.4K · Forks: 452



Similar Projects

deepeval

The LLM Evaluation Framework

Python · ★ 14.9K

mlflow

The open source AI engineering platform for agents, LLMs, and ML models. MLflow enables teams of all sizes to debug, evaluate, monitor, and optimize production-quality AI applications while controlling costs and managing access to models and data.

Python · ★ 25.5K

ragas

Supercharge Your LLM Application Evaluations 🚀

Python · ★ 13.6K

oumi

Easily fine-tune, evaluate and deploy gpt-oss, Qwen3, DeepSeek-R1, or any open source LLM / VLM!

Python · ★ 9.2K
