view article Article Introducing AutoRound: Intel’s Advanced Quantization for LLMs and VLMs +7 Apr 29 • 43
view article Article LLM Inference on Edge: A Fun and Easy Guide to run LLMs via React Native on your Phone! Mar 7 • 88
view article Article Fine-tuning LLMs to 1.58bit: extreme quantization made easy +4 Sep 18, 2024 • 272
view article Article Fine-tuning LLMs to 1.58bit: extreme quantization made easy +4 Sep 18, 2024 • 272
view article Article Llama 3.1 - 405B, 70B & 8B with multilinguality and long context +6 Jul 23, 2024 • 239
view article Article Overview of natively supported quantization schemes in 🤗 Transformers +3 Sep 12, 2023 • 12