k6 k xl ud is bugged

#2
by maxpayne07 - opened

k6 k xl ud is bugged. All the rest are fine.

Unsloth AI org

Hey guys, we re-uploaded them. Can you check again? Thanks so much! 🙏

CC: @maxpayne07

That was fast!!! Outstanding :)

Is this model really that bad at detecting multiple celebs in one image? Can someone confirm? Gemini 2.5 flash easily gets it right.
why-are-gen-z-celebs-much-plainer-looking-in-comparison-to-v0-ngelsraqp7me1

It's free, brother. I am not seeing vision models from the USA or Europe that can run at home... at least none that good.

I've found the entire Qwen3-VL series to be the most accurate vision models I've ever used, even beating Gemini, ChatGPT, etc. I ran your image through Qwen-3-VL-30B-A3B-Instruct. I know this thread is about the Thinking variant, but I can't load it atm because other users are currently using the Instruct variant on my server. It did really well, with only a couple of mistakes, which is completely reasonable for a model this size imo. Both the Instruct and Thinking versions of VL-30B-A3B have been doing extremely well for multimodal use in my experience. (I'm using the FP16 mmproj instead of FP32, btw.)

image

Maybe try implementing the Qwen-Agent "Zoom in tool", which lets the model zoom in on different parts of the image for better overall analysis.

I wouldn't call it "a couple of mistakes". It looks less than 50% correct to me, while Gemini Flash is 100% correct.
