Attention Illuminates LLM Reasoning: The Preplan-and-Anchor Rhythm Enables Fine-Grained Policy Optimization Paper • 2510.13554 • Published Oct 15 • 57
Rope to Nope and Back Again: A New Hybrid Attention Strategy Paper • 2501.18795 • Published Jan 30 • 12
The Ultra-Scale Playbook • 3.56k • The ultimate guide to training LLM on large GPU Clusters