
Jiawei Liu

ganler

https://jw-liu.xyz/

AI & ML interests

Simplifying the making of great software.

Recent Activity

upvoted a paper 3 months ago

BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution

upvoted an article 3 months ago

DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge

published a dataset 5 months ago

purpcode/ctxdistill-verified-ablation-Qwen2.5-14B-Instruct-1M-73k



upvoted a paper 3 months ago

BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution

Paper • 2510.08697 • Published Oct 9, 2025 • 36

upvoted an article 3 months ago

Article

DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge

Feb 7, 2025 • 270

published 3 datasets 5 months ago

updated 2 datasets 5 months ago

purpcode/ctxdistill-verified-Qwen2.5-14B-Instruct-1M-57k

Viewer • Updated Aug 9, 2025 • 57.7k • 65

purpcode/ctxdistill-verified-Qwen2.5-32B-Instruct-55k

Viewer • Updated Aug 9, 2025 • 55.6k • 8

updated a Space 5 months ago

README

🦀

updated a collection 5 months ago

Paper

Collection

1 item • Updated Aug 5, 2025

updated a dataset 5 months ago

purpcode/ctxdistill-verified-ablation-Qwen2.5-14B-Instruct-1M-73k

Viewer • Updated Aug 5, 2025 • 74k • 7

updated a collection 5 months ago

PurpCode Models

Collection

4 items • Updated Aug 5, 2025

published a Space 5 months ago

README

🦀

published 2 models 5 months ago

purpcode/purpcode-14b-rule-sft

Text Generation • 15B • Updated Jul 31, 2025 • 2

purpcode/purpcode-32b-rule-sft

Text Generation • 33B • Updated Jul 31, 2025 • 3

updated 2 models 5 months ago

purpcode/purpcode-32b-rule-sft

Text Generation • 33B • Updated Jul 31, 2025 • 3

purpcode/purpcode-14b-rule-sft

Text Generation • 15B • Updated Jul 31, 2025 • 2

published a model 5 months ago

purpcode/purpcode-32b-rl

Text Generation • 33B • Updated Jul 31, 2025 • 7
