JacobHicks's Collections
read later
Sharing is Caring: Efficient LM Post-Training with Collective RL Experience Sharing
Paper • 2509.08721 • Published • 660
A.S.E: A Repository-Level Benchmark for Evaluating Security in AI-Generated Code
Paper • 2508.18106 • Published • 347
VLA-Adapter: An Effective Paradigm for Tiny-Scale Vision-Language-Action Model
Paper • 2509.09372 • Published • 243
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey
Paper • 2509.02547 • Published • 228
A Survey of Reinforcement Learning for Large Reasoning Models
Paper • 2509.08827 • Published • 190
Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth
Paper • 2509.03867 • Published • 210
WebWatcher: Breaking New Frontier of Vision-Language Deep Research Agent
Paper • 2508.05748 • Published • 141
ReSum: Unlocking Long-Horizon Search Intelligence via Context Summarization
Paper • 2509.13313 • Published • 80
WebSailor-V2: Bridging the Chasm to Proprietary Agents via Synthetic Data and Scalable Reinforcement Learning
Paper • 2509.13305 • Published • 91
CapRL: Stimulating Dense Image Caption Capabilities via Reinforcement Learning
Paper • 2509.22647 • Published • 32
Scaling Agents via Continual Pre-training
Paper • 2509.13310 • Published • 117
PaddleOCR 3.0 Technical Report
Paper • 2507.05595 • Published • 18
Paper2Agent: Reimagining Research Papers As Interactive and Reliable AI Agents
Paper • 2509.06917 • Published • 41
AgentScope 1.0: A Developer-Centric Framework for Building Agentic Applications
Paper • 2508.16279 • Published • 53
LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models
Paper • 2403.13372 • Published • 175