- BroRL: Scaling Reinforcement Learning via Broadened Exploration
  Paper • 2510.01180 • Published • 18
- MITS: Enhanced Tree Search Reasoning for LLMs via Pointwise Mutual Information
  Paper • 2510.03632 • Published • 41
- Knapsack RL: Unlocking Exploration of LLMs via Optimizing Budget Allocation
  Paper • 2509.25849 • Published • 47
- Beyond the Exploration-Exploitation Trade-off: A Hidden State Approach for LLM Reasoning in RLVR
  Paper • 2509.23808 • Published • 47
Collections
Collections including paper arxiv:2504.12216
- Discrete Diffusion in Large Language and Multimodal Models: A Survey
  Paper • 2506.13759 • Published • 43
- GSAI-ML/LLaDA-8B-Instruct
  Text Generation • 8B • Updated • 239k • 337
- Dream-org/Dream-v0-Base-7B
  Text Generation • 8B • Updated • 346k • 51
- Dream-org/Dream-v0-Instruct-7B
  Text Generation • 8B • Updated • 90.6k • 144

- How to inject knowledge efficiently? Knowledge Infusion Scaling Law for Pre-training Large Language Models
  Paper • 2509.19371 • Published
- Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free
  Paper • 2505.06708 • Published • 7
- Selective Attention: Enhancing Transformer through Principled Context Control
  Paper • 2411.12892 • Published
- A Survey of Reinforcement Learning for Large Reasoning Models
  Paper • 2509.08827 • Published • 189

- d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning
  Paper • 2504.12216 • Published • 3
- Unifying Autoregressive and Diffusion-Based Sequence Generation
  Paper • 2504.06416 • Published • 3
- The Diffusion Duality
  Paper • 2506.10892 • Published • 37
- Anchored Diffusion Language Model
  Paper • 2505.18456 • Published • 1