CocoaBench: Evaluating Unified Digital Agents in the Wild • Paper • 2604.11201 • Published 11 days ago • 35
A Simple "Try Again" Can Elicit Multi-Turn LLM Reasoning • Paper • 2507.14295 • Published Jul 18, 2025 • 14
RAGEN: Understanding Self-Evolution in LLM Agents via Multi-Turn Reinforcement Learning • Paper • 2504.20073 • Published Apr 24, 2025 • 12