MINT
Quantize LLMs to fit a memory budget
Model Quantization
We build open tools for efficient AI deployment. Our research focuses on quantization methods that preserve model quality while dramatically reducing hardware requirements, bringing 400B+ parameter models to a single machine.
baa.ai · MINT Paper · GitHub
Quantize LLMs without data using per-tensor mixed precision
Generate quantization-ready student models via guided distillation
Train LLMs to be quantization-ready with sensitivity-aware methods
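To illustrate the idea behind budget-driven per-tensor mixed precision, here is a minimal sketch. It is not MINT's actual algorithm: the function name, the 4/8-bit choices, and the greedy promote-by-sensitivity policy are all illustrative assumptions. Every tensor starts at a low bit-width, then the most quantization-sensitive tensors are promoted to a higher bit-width while the memory budget allows.

```python
# Hypothetical sketch (not MINT's real method): assign per-tensor
# bit-widths to fit a memory budget, promoting sensitive tensors first.
def assign_bits(tensors, budget_bytes, low=4, high=8):
    """tensors: list of (name, num_params, sensitivity) tuples."""
    bits = {name: low for name, _, _ in tensors}
    used = sum(n * low // 8 for _, n, _ in tensors)  # bytes at low precision
    # Greedily promote the most quantization-sensitive tensors to `high`.
    for name, n, _ in sorted(tensors, key=lambda t: -t[2]):
        extra = n * (high - low) // 8  # additional bytes for promotion
        if used + extra <= budget_bytes:
            bits[name] = high
            used += extra
    return bits, used

# Toy example with made-up tensor names, sizes, and sensitivity scores.
plan, used = assign_bits(
    [("attn.q", 1000, 0.9), ("attn.k", 1000, 0.2), ("mlp.up", 2000, 0.5)],
    budget_bytes=3000,
)
```

In this toy run, `attn.q` is promoted first as the most sensitive tensor; `mlp.up` is too large to fit at 8-bit, so the remaining headroom goes to `attn.k`. A real mixed-precision scheme would measure sensitivity from the model itself (e.g. quantization error per tensor) rather than take it as input.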