Notice: This resource is provided by a third-party author. Please review the code with AI tools or manually before use to ensure security and compatibility.
Python | intel/neural-compressor

neural-compressor

SOTA low-bit LLM quantization (INT8/FP8/MXFP8/INT4/MXFP4/NVFP4) & sparsity; leading model compression techniques on PyTorch, TensorFlow, and ONNX Runtime

89.8/100
2.7K | Forks: 306

Similar Projects

nncf

81

Neural Network Compression Framework for enhanced OpenVINO™ inference

Python | 1.2K

LightCompress

67

[EMNLP 2024 & AAAI 2026] A powerful toolkit for compressing large models including LLMs, VLMs, and video generative models.

Python | 721

Chinese-LLaMA-Alpaca

79

Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment (Chinese LLaMA & Alpaca LLMs)

Python | 18.9K

langflow

95

Langflow is a powerful tool for building and deploying AI-powered agents and workflows.

Python | 149.5K