Post 3221

Mistral's new Ministral 3 models can now be run and fine-tuned locally (16GB RAM)! The Ministral 3 models have vision support and best-in-class performance for their sizes.

14B Instruct GGUF: unsloth/Ministral-3-14B-Instruct-2512-GGUF
14B Reasoning GGUF: unsloth/Ministral-3-14B-Reasoning-2512-GGUF

Step-by-step guide: https://docs.unsloth.ai/new/ministral-3
All GGUF, BnB, FP8, etc. variant uploads: https://huggingface.co/collections/unsloth/ministral-3
Post 8226

Qwen3-Next can now be run locally (30GB RAM)! The models come in Thinking and Instruct versions and use a new architecture that allows ~10x faster inference than Qwen3-32B.

Instruct GGUF: unsloth/Qwen3-Next-80B-A3B-Instruct-GGUF
Thinking GGUF: unsloth/Qwen3-Next-80B-A3B-Thinking-GGUF

Step-by-step guide: https://docs.unsloth.ai/models/qwen3-next
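The "~10x faster" figure is plausible from the model name alone: "80B-A3B" means only ~3B of the 80B parameters are active per token, and memory-bound decoding speed scales roughly with the parameters read per token. A back-of-the-envelope sketch (my own, not Unsloth's benchmark):

```python
# Rough decode-speedup estimate for an MoE model vs a dense baseline,
# assuming throughput is limited by how many weights are read per token.
def rough_speedup(dense_params_b: float, active_params_b: float) -> float:
    """Estimated decode speedup of an MoE model over a dense model."""
    return dense_params_b / active_params_b

# Qwen3-Next-80B-A3B activates ~3B parameters per token; the dense
# comparison point in the post is a 32B model.
print(round(rough_speedup(32, 3), 1))  # ~10.7x
```

This ignores attention cost and routing overhead, so treat it as an order-of-magnitude check, not a benchmark.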
Post 4203

You can now run Kimi K2 Thinking locally with our Dynamic 1-bit GGUFs: unsloth/Kimi-K2-Thinking-GGUF

We shrank the 1T-parameter model to 245GB (-62%) and retained ~85% of its accuracy on Aider Polyglot. Run on >247GB RAM for fast inference. We also collaborated with the Moonshot AI Kimi team on a system prompt fix!

Guide + fix details: https://docs.unsloth.ai/models/kimi-k2-thinking-how-to-run-locally
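A quick sanity check (my own, assuming the advertised 1T parameter count) shows why "1-bit" GGUFs of this size land near 2 bits per weight on average: dynamic quantization drops only some layers to 1-bit while keeping important ones at higher precision.

```python
# Average bits per weight implied by the post's figures:
# a 245GB file holding ~1T parameters.
def avg_bits_per_weight(size_gb: float, n_params: float) -> float:
    """Average storage bits per parameter for a quantized model file."""
    return size_gb * 1e9 * 8 / n_params

print(round(avg_bits_per_weight(245, 1e12), 2))  # ~1.96 bits/weight
```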
Post 6470

Run DeepSeek-V3.1 locally on 170GB RAM with Dynamic 1-bit GGUFs!

GGUFs: unsloth/DeepSeek-V3.1-GGUF

The 715GB model is reduced to 170GB (~76% smaller) by selectively quantizing layers. The 1-bit GGUF passes all our code tests, and we fixed the chat template for llama.cpp-supported backends.

Guide: https://docs.unsloth.ai/basics/deepseek-v3.1
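The headline reduction follows directly from the two sizes quoted in the post (a trivial check; "dynamic" here means some layers stay at higher precision while others drop to 1-bit):

```python
# Size reduction from dynamic quantization, using the post's figures.
original_gb, quantized_gb = 715, 170
reduction_pct = (1 - quantized_gb / original_gb) * 100
print(f"{reduction_pct:.1f}% smaller")  # 76.2% smaller
```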
Post 5537

Run OpenAI's new gpt-oss models locally with Unsloth GGUFs!

20b GGUF: unsloth/gpt-oss-20b-GGUF
120b GGUF: unsloth/gpt-oss-120b-GGUF

The models will run on 14GB RAM for 20b and 66GB RAM for 120b.
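A tiny helper (my own sketch, using only the RAM figures quoted in the post) for deciding which gpt-oss GGUF a machine can expect to run:

```python
# Minimum RAM per model, as quoted in the post (GB).
RAM_NEEDED_GB = {"gpt-oss-20b": 14, "gpt-oss-120b": 66}

def runnable_models(available_ram_gb: float) -> list[str]:
    """Return the models whose quoted minimum RAM fits the given budget."""
    return [m for m, need in RAM_NEEDED_GB.items() if need <= available_ram_gb]

print(runnable_models(16))  # ['gpt-oss-20b']
print(runnable_models(96))  # ['gpt-oss-20b', 'gpt-oss-120b']
```

Leave some headroom beyond these minimums for the KV cache and the OS.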
Post 3665

It's Qwen3 week! We uploaded Dynamic 2-bit GGUFs for:

Qwen3-Coder: unsloth/Qwen3-Coder-480B-A35B-Instruct-GGUF
Qwen3-2507: unsloth/Qwen3-235B-A22B-Instruct-2507-GGUF

So you can run them both locally! Guides are in the model cards.
Post 3884

Made some 245GB (80% size reduction) 1.8-bit quants for Kimi K2! unsloth/Kimi-K2-Instruct-GGUF
Post 3972

We fixed more issues! Use --jinja for all:
* Fixed Nanonets OCR-s: unsloth/Nanonets-OCR-s-GGUF
* Fixed THUDM GLM-4: unsloth/GLM-4-32B-0414-GGUF
* DeepSeek Chimera v2 is uploading: unsloth/DeepSeek-TNG-R1T2-Chimera-GGUF
Post 3181

Gemma 3n fine-tuning is now 1.5x faster and uses 50% less VRAM in Unsloth! Click "Use this model", then "Google Colab": unsloth/gemma-3n-E4B-it, unsloth/gemma-3n-E2B-it

https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Gemma3N_(4B)-Conversational.ipynb
Post 1319

We updated lots of our GGUFs and uploaded many new ones!
* unsloth/dots.llm1.inst-GGUF
* unsloth/Jan-nano-GGUF
* unsloth/Nanonets-OCR-s-GGUF
* Updated and fixed the Q8_0 upload for unsloth/DeepSeek-R1-0528-Qwen3-8B-GGUF
* Added Q2_K_XL for unsloth/DeepSeek-R1-0528-GGUF
* Updated and fixed vision support for unsloth/Llama-4-Scout-17B-16E-Instruct-GGUF
Post 2524

Mistral releases Magistral, their new reasoning models!

GGUFs to run: unsloth/Magistral-Small-2506-GGUF

Magistral-Small-2506 excels at mathematics and coding. You can run the 24B model locally with just 32GB RAM by using our Dynamic GGUFs.
Post 3857

New DeepSeek-R1-0528 1.65-bit Dynamic GGUF! Run the model locally even more easily: it will fit on a 192GB MacBook and run at 7 tokens/s.

DeepSeek-R1-0528 GGUFs: unsloth/DeepSeek-R1-0528-GGUF
Qwen3-8B DeepSeek-R1-0528 GGUFs: unsloth/DeepSeek-R1-0528-Qwen3-8B-GGUF

Read our guide: https://docs.unsloth.ai/basics/deepseek-r1-0528
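Why a 1.65-bit quant fits on a 192GB machine: DeepSeek-R1 has 671B parameters, so the weights alone come to roughly 138GB at that average bit width (my own estimate, treating 1.65 bits as the average across all weights):

```python
# Estimated on-disk/in-RAM size of a quantized model's weights.
def quant_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Model weight size in GB at a given average bits per parameter."""
    return n_params * bits_per_weight / 8 / 1e9

print(round(quant_size_gb(671e9, 1.65)))  # ~138 GB of weights
```

That leaves tens of GB of headroom on a 192GB machine for the KV cache, activations, and the OS.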
Post 2331

Qwen3 128K context length: we've released Dynamic 2.0 GGUFs + 4-bit safetensors! Fixed: they now work on any inference engine, and we fixed issues with the chat template.

Qwen3 GGUFs:
30B-A3B: unsloth/Qwen3-30B-A3B-GGUF
235B-A22B: unsloth/Qwen3-235B-A22B-GGUF
32B: unsloth/Qwen3-32B-GGUF

128K context length:
30B-A3B: unsloth/Qwen3-30B-A3B-128K-GGUF
235B-A22B: unsloth/Qwen3-235B-A22B-128K-GGUF
32B: unsloth/Qwen3-32B-128K-GGUF

Read our guide on running Qwen3 here: https://docs.unsloth.ai/basics/qwen3-how-to-run-and-finetune
All Qwen3 uploads: unsloth/qwen3-680edabfb790c8c34a242f95