Notice: This resource is provided by a third-party author. Please review the code with AI tools or manually before use to ensure security and compatibility.
C++ · cactus-compute/cactus

cactus

Low-latency AI engine for mobile devices & wearables

86.1/100
5.3K · Forks: 427

Similar Projects

runanywhere-sdks

80

Production-ready toolkit to run AI locally

C++ · 10.3K

distributed-llama

70

Distributed LLM inference. Connect home devices into a powerful cluster to accelerate LLM inference. More devices means faster inference.

C++ · 2.9K

llama.rn

74

React Native binding of llama.cpp

C++ · 967

tiny-vllm

52

Build your own high-performance LLM inference engine in C++ and CUDA, a smaller version of vLLM

C++ · 776