Minimum-distortion embedding with PyTorch
Extensible, parallel implementations of t-SNE
FlashInfer: Kernel Library for LLM Serving
A high-performance ML model serving framework that offers dynamic batching and CPU/GPU pipelines to fully exploit your compute machine
Our first fully AI-generated deep learning system