Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning (arXiv:2506.01939)
SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics (arXiv:2506.01844)
Taming LLMs by Scaling Learning Rates with Gradient Grouping (arXiv:2506.01049)
ARIA: Training Language Agents with Intention-Driven Reward Aggregation (arXiv:2506.00539)
Learning Video Generation for Robotic Manipulation with Collaborative Trajectory Control (arXiv:2506.01943)
SRPO: Enhancing Multimodal LLM Reasoning via Reflection-Aware Reinforcement Learning (arXiv:2506.01713)
AReaL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning (arXiv:2505.24298)
MiCRo: Mixture Modeling and Context-aware Routing for Personalized Preference Learning (arXiv:2505.24846)
(arXiv:2506.01928)
Token-Shuffle: Towards High-Resolution Image Generation with Autoregressive Models (arXiv:2504.17789)
TiDAR: Think in Diffusion, Talk in Autoregression (arXiv:2511.08923)
CogDPM: Diffusion Probabilistic Models via Cognitive Predictive Coding (arXiv:2405.02384)
A Stable, Fast, and Fully Automatic Learning Algorithm for Predictive Coding Networks (arXiv:2212.00720)
Prodigy: An Expeditiously Adaptive Parameter-Free Learner (arXiv:2306.06101)
Hierarchical Reasoning Model (arXiv:2506.21734)
Why Diffusion Models Don't Memorize: The Role of Implicit Dynamical Regularization in Training (arXiv:2505.17638)
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models (arXiv:2508.06471)
Learning to Optimize: A Primer and A Benchmark (arXiv:2103.12828)