davidanugraha/DeepSeek-R1-Distill-Qwen-7B-Overthinking-SFT • Text Generation • 8B • Updated 20 days ago • 8
davidanugraha/DeepSeek-R1-Distill-Qwen-1.5B-Overthinking-SFT • Text Generation • 2B • Updated 20 days ago • 5
davidanugraha/Qwen2.5-Coder-3B-Instruct-ReinfPP-Reflection-16k-20test-passrate • 3B • Updated Dec 13, 2025
davidanugraha/Qwen2.5-Coder-3B-Instruct-ReinfPP-Reflection-16k-20test-binary • 3B • Updated Dec 13, 2025 • 1
davidanugraha/Qwen2.5-Coder-3B-Instruct-ReinfPP-Reflection-8k-20test-binary • 3B • Updated Dec 13, 2025 • 4
davidanugraha/Qwen2.5-Coder-3B-Instruct-ReinfPP-Reflection-4k-20test-passrate • 3B • Updated Dec 13, 2025
davidanugraha/Qwen2.5-Coder-3B-Instruct-ReinfPP-Reflection-4k-20test-binary • 3B • Updated Dec 13, 2025
davidanugraha/Qwen2.5-Coder-3B-Instruct-ReinfPP-Baseline-16k-20test-passrate • 3B • Updated Dec 13, 2025
davidanugraha/Qwen2.5-Coder-3B-Instruct-ReinfPP-Baseline-8k-20test-passrate • 3B • Updated Dec 13, 2025
davidanugraha/Qwen2.5-Coder-3B-Instruct-ReinfPP-Baseline-4k-20test-passrate • 3B • Updated Dec 13, 2025