Reward Models 10-2025 Collection A collection of great reward models for research and production • 7 items • Updated 11 days ago • 13
ProfBench: Multi-Domain Rubrics requiring Professional Knowledge to Answer and Judge Paper • 2510.18941 • Published Oct 21, 2025 • 13
nvidia/Llama-3.3-Nemotron-70B-Reward-Principle Text Generation • 71B • Updated Oct 30, 2025 • 818 • 6
RLBFF: Binary Flexible Feedback to bridge between Human Feedback & Verifiable Rewards Paper • 2509.21319 • Published Sep 25, 2025 • 9