Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe Paper • 2604.13016 • Published 11 days ago • 85
TaH Collection Think-at-Hard: Selective Latent Iterations to Improve Reasoning Language Models • 9 items • Updated 13 days ago • 2