:metal: TT-NN operator library and TT-Metalium low-level kernel programming model.
Fast Multimodal LLM on Mobile Devices
Run LLMs on AMD Ryzen™ AI NPUs in minutes. Just like Ollama, but purpose-built and deeply optimized for AMD NPUs.
High-speed Large Language Model Serving for Local Deployment
Lemonade helps users discover and run local AI apps by serving optimized LLMs right from their own GPUs and NPUs. Join our Discord: https://discord.gg/5xXzkMu8Zk