UD-Q6_K_XL is bugged

by maxpayne07 - opened 26 days ago

Discussion

maxpayne07

26 days ago

The UD-Q6_K_XL quant is bugged. All the rest are fine.

shimmyshimmer

Unsloth AI org 25 days ago

Hey guys we re-uploaded them, can you guys check again, thanks so much! 🙏

CC: @maxpayne07

maxpayne07

25 days ago

That was fast!!! Outstanding :)

urtuuuu

25 days ago • edited 25 days ago

Is this model really that bad at detecting multiple celebs in one image? Can someone confirm? Gemini 2.5 flash easily gets it right.

maxpayne07

25 days ago

It's free, brother. I'm not seeing any vision models from the USA or Europe that can run at home... at least none that good.

ayylmaonade

24 days ago • edited 24 days ago

> Is this model really that bad at detecting multiple celebs in one image? Can someone confirm? Gemini 2.5 flash easily gets it right.

I've found the entire Qwen3-VL series to be the most accurate vision models I've ever used, even beating Gemini, ChatGPT, etc. I ran your image through Qwen3-VL-30B-A3B-Instruct. I know this thread is about the Thinking variant, but I can't load it atm because other users are currently using the Instruct variant on my server. It did really well, with only a couple of mistakes, which is completely reasonable for a model this size imo. Both the Instruct and Thinking versions of VL-30B-A3B have been performing extremely well for multimodal use in my experience. (I'm using the FP16 mmproj instead of FP32, btw.)

Maybe try implementing the Qwen-Agent "Zoom in tool" that allows the model to zoom in on different aspects of the image for better overall analysis.
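
To make the zoom-in idea concrete, here is a rough sketch of the geometry such a tool needs: turning a region the model asks about into a padded pixel crop box. This is not Qwen-Agent's actual API. The function name `zoom_box`, the normalized `(x0, y0, x1, y1)` box format, and the `padding` parameter are all assumptions made for illustration.

```python
import math


def zoom_box(bbox, image_size, padding=0.1):
    """Convert a normalized region into a padded pixel crop box.

    Hypothetical helper, not part of Qwen-Agent.
    bbox: (x0, y0, x1, y1) with each value in the range 0..1.
    image_size: (width, height) in pixels.
    padding: extra margin on each side, as a fraction of the region size.
    Returns (left, top, right, bottom) clamped to the image bounds.
    """
    width, height = image_size
    x0, y0, x1, y1 = bbox
    if not (0 <= x0 < x1 <= 1 and 0 <= y0 < y1 <= 1):
        raise ValueError("bbox must be normalized with x0 < x1 and y0 < y1")

    # Grow the region a little so the subject keeps some context around it.
    pad_x = (x1 - x0) * padding
    pad_y = (y1 - y0) * padding

    # Round outward, then clamp so the box never leaves the image.
    left = max(0, math.floor((x0 - pad_x) * width))
    top = max(0, math.floor((y0 - pad_y) * height))
    right = min(width, math.ceil((x1 + pad_x) * width))
    bottom = min(height, math.ceil((y1 + pad_y) * height))
    return (left, top, right, bottom)
```

The resulting tuple uses the same (left, upper, right, lower) order that Pillow's `Image.crop` expects, so the crop could then be upscaled and sent back to the model as a second image. How the model picks the region in the first place is left out here.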

urtuuuu

24 days ago

I wouldn't call it a "couple mistakes". It's less than 50% correct to me, while Gemini Flash is 100% correct.
