Qwen3-Next-Thinking now updated with iMatrix! + better performance with llama.cpp
by danielhanchen
Now updated with imatrix. The quantized Qwen3-Next uploads should now be much improved, especially at lower bit widths! :)
Also, thanks to the llama.cpp team, model inference has been optimized even further.
Yes, you will need to redownload.