jmaczan/tiny-vllm (C++)
tiny-vllm
Build your own high-performance LLM inference engine in C++ and CUDA: a smaller version of vLLM.