Haizhong

haizhongzheng

http://zhenghaizhong.com/

AI & ML interests

Efficient machine learning

Recent Activity

upvoted an article 14 days ago

Forge: Scalable Agent RL Framework and Algorithm

Article • 15 days ago • 126

upvoted a paper 3 months ago

Think-at-Hard: Selective Latent Iterations to Improve Reasoning Language Models

Paper • 2511.08577 • Published Nov 11, 2025 • 108

upvoted a paper 4 months ago

When "Correct" Is Not Safe: Can We Trust Functionally Correct Patches Generated by Code Agents?

Paper • 2510.17862 • Published Oct 15, 2025 • 7

commented a paper 4 months ago

When "Correct" Is Not Safe: Can We Trust Functionally Correct Patches Generated by Code Agents?

Paper • 2510.17862 • Published Oct 15, 2025 • 7

upvoted a paper 5 months ago

Prosperity before Collapse: How Far Can Off-Policy RL Reach with Stale Data on LLMs?

Paper • 2510.01161 • Published Oct 1, 2025 • 14

commented a paper 5 months ago

Prosperity before Collapse: How Far Can Off-Policy RL Reach with Stale Data on LLMs?

Paper • 2510.01161 • Published Oct 1, 2025 • 14

upvoted 2 papers 9 months ago

Multiverse: Your Language Models Secretly Decide How to Parallelize and Merge Generation

Paper • 2506.09991 • Published Jun 11, 2025 • 55

Kinetics: Rethinking Test-Time Scaling Laws

Paper • 2506.05333 • Published Jun 5, 2025 • 6

updated a dataset 11 months ago

haizhongzheng/DAPO-Math-17K-cleaned

Viewer • Updated Mar 26, 2025 • 17.9k • 201 • 1

published a dataset 11 months ago

haizhongzheng/DAPO-Math-17K-cleaned

Viewer • Updated Mar 26, 2025 • 17.9k • 201 • 1

upvoted a paper 11 months ago

Harmful Terms and Where to Find Them: Measuring and Modeling Unfavorable Financial Terms and Conditions in Shopping Websites at Scale

Paper • 2502.01798 • Published Feb 3, 2025 • 1

published a model about 1 year ago

haizhongzheng/Qwen2.5-1.5B-Open-R1-GRPO

Updated Feb 10, 2025

updated a model over 1 year ago

haizhongzheng/Llama-3.2-1B-dpo-lora

Text Generation • Updated Nov 26, 2024 • 1

upvoted a paper over 1 year ago

ChartMimic: Evaluating LMM's Cross-Modal Reasoning Capability via Chart-to-Code Generation

Paper • 2406.09961 • Published Jun 14, 2024 • 55
