Qwen3-Next-Thinking now updated with iMatrix! + better performance with llama.cpp

by danielhanchen - opened 2 days ago

Unsloth AI org 2 days ago

Now updated with imatrix. The quantized Qwen3-Next uploads should now be much improved, especially at lower bit rates! :)

Also, thanks to the llama.cpp team, model inference has been optimized even further.

Yes, you will need to redownload.

danielhanchen pinned discussion 2 days ago
