ReLIFT is a training method that interleaves reinforcement learning (RL) with online fine-tuning (FT), achieving better performance and efficiency than RL or supervised fine-tuning (SFT) alone.
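As a rough illustration of what "interleaving" RL with online FT could look like, here is a minimal toy scheduler. It only records which phase would run and on which questions; no model is trained. The trigger rule (buffer questions the policy fails during RL, then run one FT phase once the buffer fills) and the names `interleaved_schedule`, `is_solved`, and `ft_batch_size` are assumptions made for this sketch, not details taken from this page.

```python
from collections import deque
from typing import Callable, Deque, Iterable, List, Tuple

Phase = Tuple[str, List[str]]  # ("rl" or "ft", questions used in that phase)


def interleaved_schedule(
    questions: Iterable[str],
    is_solved: Callable[[str], bool],
    ft_batch_size: int = 2,
) -> List[Phase]:
    """Return the phases a toy interleaved RL/FT trainer would run.

    Assumption (hypothetical rule): each question gets one RL step; questions
    the policy fails are buffered, and a fine-tuning phase runs whenever the
    buffer holds `ft_batch_size` of them.
    """
    hard_buffer: Deque[str] = deque()
    phases: List[Phase] = []
    for q in questions:
        phases.append(("rl", [q]))          # RL step on the current question
        if not is_solved(q):
            hard_buffer.append(q)           # remember questions RL failed on
        if len(hard_buffer) >= ft_batch_size:
            batch = [hard_buffer.popleft() for _ in range(ft_batch_size)]
            phases.append(("ft", batch))    # online FT phase on the hard batch
    return phases


if __name__ == "__main__":
    solved = {"q1", "q3"}  # stand-in for a real correctness check
    for phase in interleaved_schedule(["q1", "q2", "q3", "q4"], lambda q: q in solved):
        print(phase)
```

In a real setup the `is_solved` check would come from rollout rewards and each phase would update model weights; the sketch only shows the control flow of alternating the two kinds of update.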
RoadMa
RoadQAQ
Models (8)
RoadQAQ/ReLIFT-Qwen2.5-Math-7B-Zero
Question Answering • 8B • Updated • 13
RoadQAQ/Qwen2.5-7B-think
Text Generation • 8B • Updated • 3
RoadQAQ/Qwen2.5-Math-1.5B-16k-think
Text Generation • 2B • Updated • 2.05k
RoadQAQ/ReLIFT-Qwen2.5-7B-Zero
Question Answering • 8B • Updated • 9 • 2
RoadQAQ/Qwen2.5-Math-7B-16k-think
Text Generation • 8B • Updated • 1.98k
RoadQAQ/ReLIFT-Qwen2.5-Math-1.5B-Zero
Question Answering • 2B • Updated • 11
RoadQAQ/OpenR1-Distill-7B
Updated
RoadQAQ/video_llm_template
Updated