The Smol Training Playbook 📚 The secrets to building world-class LLMs • 2.55k
Cerebras REAP Collection Sparse MoE models compressed using the REAP (Router-weighted Expert Activation Pruning) method • 17 items • Updated 21 days ago • 51
Qwen/Qwen3-VL-30B-A3B-Instruct-FP8 Image-Text-to-Text • 31B • Updated 12 days ago • 226k • 90
Qwen/Qwen3-VL-235B-A22B-Instruct Image-Text-to-Text • 236B • Updated 12 days ago • 69.7k • 325
mmBERT: a modern multilingual encoder Collection mmBERT is trained on 3T tokens from over 1,800 languages, achieving SoTA benchmark scores and exceptional low-resource performance • 16 items • Updated Sep 9 • 49