LLM KV cache compression made easy
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
Machine Learning Engineering Open Book
Supercharge Your LLM with the Fastest KV Cache Layer
Parallax is a distributed model serving framework that lets you build your own AI cluster anywhere.