⚠ Notice: This resource is provided by a third-party author. Please review the code with AI tools or manually before use to ensure security and compatibility.

Python · microsoftarchive/promptbench

promptbench

A unified evaluation framework for large language models

54.1/100

★ 2.8K · Forks: 222

Similar Projects

promptflow

Build high-quality LLM apps - from prototyping, testing to production deployment and monitoring.

Python · ★ 11.2K

opencompass

OpenCompass is an LLM evaluation platform supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) across 100+ datasets.

Python · ★ 7.2K

llm-guard

The Security Toolkit for LLM Interactions

Python · ★ 3.2K

awesome-gpt-prompt-engineering

A curated list of awesome resources, tools, and other shiny things for LLM prompt engineering.

Python · ★ 1.6K