Noteworthy Models 2025 (22B-36B)
Interesting models released in 2025. Tested for overall response quality, ad-hoc function calling, summary, menu navigation and creative writing.
Text Generation • 33B • Updated • 4.04M • • 594Note Last generation of Qwen-32B dense models. Good CoT / Reasoning models with all the usual bells and whistles of the Qwen series. Not that great at anything creative-writing related, but ruthlessly efficient for everything else in this size range.
mistralai/Mistral-Small-3.2-24B-Instruct-2506
24B • Updated • 129k • 525Note Decent open source 24B model from Mistral. Passed most of my tests. Obedient, very good at abiding by the system prompt. But, a lot worse at creative writing than 22B, it suffers from the same repetitiveness as other Mistral models. It's more a task oriented LLM than a generalist one.
NousResearch/Hermes-4.3-36B
Text Generation • 36B • Updated • 799 • 62Note Based on the recent Seed-OSS-36B. This is a very strong model. It's quite clever, and mostly uncensored. It shoots above its weight class in all my formal testing (agentic tasks, data retrieval, Q&A, web search, ...), and seems to be decent at more creative tasks. It uses the LLama3-Chat + Think tags format. Tested in 16K context / IQ4_XS.
Entropicengine/Pinecone-sage-24b
Text Generation • 24B • Updated • 13 • 10Note Surprisingly good merge of various Mistral 3 fine-tunes. It somehow manages to keep the base model's brain and coherence while being decent at both casual and RP chats. Excellent job. Use L7 instruct format.
maldv/QwentileLambda2.5-32B-Instruct
Text Generation • 33B • Updated • 16 • • 2Note QwQ32B and various Qwen32B 2.5 models (including the excellent Qwentile, Snowdrop, and Loqwqtus) merged in this instruct model. It's not a CoT model, well, more like it won't always do CoT if you try (still worth attempting for complex stuff). As a creative model, it's top notch, as Qwen models go in that size range. It's in ChatML.
zerofata/MS3.2-PaintedFantasy-Visage-v4-34B
34B • Updated • 22 • 12Note I normally don't like upscaled models, but this v4 (after a broken v2) is starting to show some progress. It's not that good at following instructions, but for more free form activities like chatting or creative writing, this is a very interesting model. Use L7 instruct format (sadly). There's a very decent 24B variant too.
PocketDoc/Dans-PersonalityEngine-V1.2.0-24b
Text Generation • 24B • Updated • 172 • • 169Note Good Mistral Small 24B fine-tune. It's clever, and doesn't have the usual issues plaguing MS24B models. Bonus on top, it's using ChatML instead of Mistral's terrible format. Much better than its follow up 1.3 version. A bit too much of a parrot on occasions, though.
allenai/Olmo-3-32B-Think
Text Generation • 1.05M • Updated • 7.36k • • 141Note This is a fully open source model. It's not perfect, low max context stretched via Yarn, its CoT can get really long for simple questions, it's quite unruly for dialogs and creative stuff. Still it's an event in itself, and worth keeping an eye on. It uses ChatML + Think tags.
Delta-Vector/Austral-32B-GLM4-Winton
Text Generation • 33B • Updated • 36 • 7Note The only viable GLM4 32B fine-tune currently available. It's not perfect, and sub-par at taking older dialogs into account, but it has a decent writing style, and overall okay intelligence for its size.
TheDrummer/Precog-24B-v1
415k • Updated • 332 • 29Note Original CoT RP model. Instead of doing the classic step by step, it plans its response in advance. It's definitely not helping with overall intelligence, but it seems to improve story continuity. So basically the opposite of other CoT models. It also seems to help fighting against the repetitive nature of Mistral models a bit. The CoT is short, and won't eat tons of tokens either. Worth a try. Uses L7-Thinker format.
nbeerbower/Dumpling-Qwen2.5-32B-v2
Text Generation • 33B • Updated • 5 • 2Note Decent Qwen-32B creative fine-tune. While it suffers from some of the usual Qwen-isms, it's fairly bright, and creative with a bit of guidance. Passed all my usual tasks/tests with no issues. Innate Qwen 2.5 function calling hasn't been tested.