arxiv:2311.01056
Yihong Wu
Yihong7788
AI & ML interests
None yet
Recent Activity
commented on a paper about 1 month ago
It Takes Two: Your GRPO Is Secretly DPO
upvoted a paper 6 months ago
It Takes Two: Your GRPO Is Secretly DPO
upvoted a paper 6 months ago
On Predictability of Reinforcement Learning Dynamics for Large Language Models
Organizations
None yet