Notice: This resource is provided by a third-party author. Please review the code with AI tools or manually before use to ensure security and compatibility.
Python · vllm-project/vllm

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Score: 92.7/100
Stars: 77.9K · Forks: 16.0K
View on GitHub · Homepage →

Similar Projects

sglang

Score: 91

SGLang is a high-performance serving framework for large language models and multimodal models.

Python · Stars: 26.3K

LMCache

Score: 87

Supercharge Your LLM with the Fastest KV Cache Layer

Python · Stars: 8.1K

nano-vllm

Score: 66

Nano vLLM

Python · Stars: 13.1K

text-generation-inference

Score: 78

Large Language Model Text Generation Inference

Python · Stars: 10.8K