Notice: This resource is provided by a third-party author. Please review the code with AI tools or manually before use to ensure security and compatibility.
Python · microsoftarchive/promptbench

promptbench

A unified evaluation framework for large language models

55.7/100
2.8K · Forks: 221

Similar Projects

promptflow

89

Build high-quality LLM apps - from prototyping, testing to production deployment and monitoring.

Python · 11.1K

opencompass

85

OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.

Python · 7.1K

llm-guard

55

The Security Toolkit for LLM Interactions

Python · 3.1K

awesome-gpt-prompt-engineering

53

A curated list of awesome resources, tools, and other shiny things for LLM prompt engineering.

Python · 1.6K