Article Introducing AnyLanguageModel: One API for Local and Remote LLMs on Apple Platforms 17 days ago • 29
Article Illustrating Reinforcement Learning from Human Feedback (RLHF) Dec 9, 2022 • 376
Article Navigating the RLHF Landscape: From Policy Gradients to PPO, GAE, and DPO for LLM Alignment Feb 11 • 89
Beyond Fixed: Variable-Length Denoising for Diffusion Large Language Models Paper • 2508.00819 • Published Aug 1 • 62
Cognitive Kernel-Pro: A Framework for Deep Research Agents and Agent Foundation Models Training Paper • 2508.00414 • Published Aug 1 • 93
SWE-Debate: Competitive Multi-Agent Debate for Software Issue Resolution Paper • 2507.23348 • Published Jul 31 • 11
Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination Paper • 2507.10532 • Published Jul 14 • 89
An Empirical Study of Using Large Language Models for Unit Test Generation Paper • 2305.00418 • Published Apr 30, 2023 • 2
TESTEVAL: Benchmarking Large Language Models for Test Case Generation Paper • 2406.04531 • Published Jun 6, 2024 • 1