Does not work with Ollama 0.15.5 - 0.15.6
(base) ➜ ~ ollama --version
ollama version is 0.15.6
(base) ➜ ~ ollama run hf.co/unsloth/Qwen3-Coder-Next-GGUF:Q4_K_M
Error: 500 Internal Server Error: llama runner process has terminated: error loading model: missing tensor 'blk.0.ssm_in.weight'
llama_model_load_from_file_impl: failed to load model
Same here. I'm trying to run this on a RTX Pro 6000 96G:
$ docker exec -it ollama ollama run hf.co/unsloth/Qwen3-Coder-Next-GGUF:latest
Error: 500 Internal Server Error: llama runner process has terminated: error loading model: missing tensor 'blk.0.ssm_in.weight'
$ docker exec -it ollama ollama run hf.co/unsloth/Qwen3-Coder-Next-GGUF:Q4_K_M
Error: 500 Internal Server Error: llama runner process has terminated: error loading model: missing tensor 'blk.0.ssm_in.weight'
$ docker exec -it ollama ollama --version
ollama version is 0.15.6
llama_model_load: error loading model: missing tensor 'blk.0.ssm_in.weight'
llama_model_load_from_file_impl: failed to load model
panic: unable to load model: /ollama_models/blobs/sha256-eab53ec181795fd2b35cf875ccaa76cb19bab27f7b3d47ffc0cafbc5e196ecb2
and strings /ollama_models/blobs/sha256-eab53ec181795fd2b35cf875ccaa76cb19bab27f7b3d47ffc0cafbc5e196ecb2 | grep "blk.0.ssm_in.weight" doesn't return anything, so maybe it's actually missing from the file?
INFO:gguf-dump:* Loading: /ollama_models/blobs/sha256-eab53ec181795fd2b35cf875ccaa76cb19bab27f7b3d47ffc0cafbc5e196ecb2
4: 8388608 | 2048, 4096, 1, 1 | Q4_K | blk.0.attn_gate.weight
5: 2048 | 2048, 1, 1, 1 | F32 | blk.0.attn_norm.weight
6: 16777216 | 2048, 8192, 1, 1 | Q5_K | blk.0.attn_qkv.weight
7: 536870912 | 512, 2048, 512, 1 | Q6_K | blk.0.ffn_down_exps.weight
8: 1048576 | 512, 2048, 1, 1 | Q6_K | blk.0.ffn_down_shexp.weight
9: 536870912 | 2048, 512, 512, 1 | Q4_K | blk.0.ffn_gate_exps.weight
10: 1048576 | 2048, 512, 1, 1 | F32 | blk.0.ffn_gate_inp.weight
11: 2048 | 2048, 1, 1, 1 | BF16 | blk.0.ffn_gate_inp_shexp.weight
12: 1048576 | 2048, 512, 1, 1 | Q5_K | blk.0.ffn_gate_shexp.weight
13: 536870912 | 2048, 512, 512, 1 | Q4_K | blk.0.ffn_up_exps.weight
14: 1048576 | 2048, 512, 1, 1 | Q5_K | blk.0.ffn_up_shexp.weight
15: 2048 | 2048, 1, 1, 1 | F32 | blk.0.post_attention_norm.weight
16: 32 | 32, 1, 1, 1 | F32 | blk.0.ssm_a
17: 131072 | 2048, 64, 1, 1 | Q4_K | blk.0.ssm_ba.weight
18: 32768 | 4, 8192, 1, 1 | F32 | blk.0.ssm_conv1d.weight
19: 32 | 32, 1, 1, 1 | F32 | blk.0.ssm_dt.bias
20: 128 | 128, 1, 1, 1 | F32 | blk.0.ssm_norm.weight
21: 8388608 | 4096, 2048, 1, 1 | Q4_K | blk.0.ssm_out.weight
There is indeed no blk.0.ssm_in.weight anywhere in the dump.
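For anyone who wants to double-check a dump without eyeballing it, here is a minimal, hypothetical helper that scans gguf-dump style lines for required tensor names. It assumes the line format shown above ("index: elements | shape | dtype | name"); the function names are made up for this sketch:

```python
# Sketch: find required tensors missing from a gguf-dump tensor listing.
# Assumes lines formatted like gguf-dump output above:
#   "<idx>: <elements> | <shape> | <dtype> | <tensor name>"

def tensor_names(dump_lines):
    """Extract the tensor name (last pipe-separated field) from each line."""
    names = []
    for line in dump_lines:
        parts = line.split("|")
        if len(parts) >= 4:
            names.append(parts[-1].strip())
    return names

def find_missing(dump_lines, required):
    """Return the required tensor names that are absent from the dump."""
    present = set(tensor_names(dump_lines))
    return sorted(set(required) - present)

if __name__ == "__main__":
    dump = [
        "4: 8388608 | 2048, 4096, 1, 1 | Q4_K | blk.0.attn_gate.weight",
        "21: 8388608 | 4096, 2048, 1, 1 | Q4_K | blk.0.ssm_out.weight",
    ]
    print(find_missing(dump, ["blk.0.ssm_in.weight", "blk.0.ssm_out.weight"]))
    # -> ['blk.0.ssm_in.weight']
```

Piping the full gguf-dump output through this confirms the same thing the error message says: the runner expects blk.0.ssm_in.weight and the blob doesn't contain it.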
Same problem here with Ollama v0.15.6.
Starting from late last year, GGUFs no longer work out of the box with Ollama, so at the moment we only recommend using GGUFs with llama.cpp-compatible backends.
Thanks, any clue if they will update it?
ollama --version
ollama version is 0.16.2
ollama run hf.co/lovedheart/Qwen3-Coder-Next-REAP-60B-A3B-GGUF:Q3_K_XL
Error: 500 Internal Server Error: llama runner process has terminated: error loading model: missing tensor 'blk.0.ssm_in.weight'
The same issue is still present in Ollama version 0.17.0.
Same here - Ollama latest version
As I recall, it's because Ollama does not support the latest quantization format, so I use LM Studio instead.