Training/Fine-tuning at the speed of light
Lemonade helps users discover and run local AI apps by serving optimized LLMs right from their own GPUs and NPUs. Join our Discord: https://discord.gg/5xXzkMu8Zk
A highly optimized LLM inference acceleration engine for Llama and its variants.
MNN: A blazing-fast, lightweight inference engine battle-tested by Alibaba, powering high-performance on-device LLMs and Edge AI.
High-speed Large Language Model Serving for Local Deployment