Notice: This resource is provided by a third-party author. Please review the code with AI tools or manually before use to ensure security and compatibility.
Python · NVIDIA/kvpress

kvpress

LLM KV cache compression made easy

79.4/100
1.1K · Forks: 150

Similar Projects

peft

91

🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.

Python · 21.3K

ml-engineering

72

Machine Learning Engineering Open Book

Python · 18.1K

LMCache

88

LMCache: Supercharge Your LLM with the Fastest KV Cache Layer

Python · 8.5K

parallax

74

Parallax is a distributed model serving framework that lets you build your own AI cluster anywhere

Python · 1.3K