Qwen3-GSM8K-LoRA
Qwen3-GSM8K-LoRA is a lightweight fine-tuned version of Qwen3-0.6B, adapted for multi-step mathematical reasoning on the GSM8K dataset. The model learns to produce explicit chain-of-thought reasoning followed by a numeric answer.
Model type: LoRA fine-tuned Qwen3-0.6B-Base
Task: Mathematical reasoning and step-by-step problem solving
Base model: Qwen3-0.6B-Base
Dataset: GSM8K (OpenAI)
Fine-tuning method: Low-Rank Adaptation (LoRA)
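Usage
A minimal inference sketch: load the base model and attach the LoRA adapter with the PEFT library. The adapter repository name and the "Question:/Answer:" prompt format below are assumptions and may need to be adjusted to match the published adapter and its training prompt.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE_ID = "Qwen/Qwen3-0.6B-Base"
ADAPTER_ID = "Neural-Hacker/Qwen3-GSM8K"  # assumed adapter repo (see model tree below)

tokenizer = AutoTokenizer.from_pretrained(BASE_ID)
model = AutoModelForCausalLM.from_pretrained(BASE_ID, torch_dtype=torch.bfloat16)
model = PeftModel.from_pretrained(model, ADAPTER_ID)  # attach the LoRA weights
model.eval()

question = (
    "Natalia sold clips to 48 of her friends in April, and then she sold half "
    "as many clips in May. How many clips did Natalia sell altogether?"
)
prompt = f"Question: {question}\nAnswer:"  # assumed prompt format
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```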
Training Details
Technique: LoRA fine-tuning (rank = 8, alpha = 16, dropout = 0.05)
Epochs: 3
Batch size: 2
Learning rate: 2e-4
Precision: bfloat16 / mixed
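The hyperparameters above map directly onto a PEFT LoraConfig and Hugging Face TrainingArguments; the sketch below shows one possible configuration. The target modules and output directory are assumptions not stated in this card.

```python
from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=8,                  # LoRA rank
    lora_alpha=16,        # scaling factor
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed attention projections
)

training_args = TrainingArguments(
    output_dir="qwen3-gsm8k-lora",   # assumed
    num_train_epochs=3,
    per_device_train_batch_size=2,
    learning_rate=2e-4,
    bf16=True,                       # bfloat16 mixed precision
)
```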
Evaluation (GSM8K test set, 1,319 problems)
Qwen3-0.6B-Base (no fine-tuning): 33.39% accuracy
Qwen3-GSM8K-LoRA: 35.41% accuracy
Accuracy is measured by exact match of the final numeric answer.
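A minimal sketch of this scoring rule, assuming GSM8K references keep the gold answer after the "####" marker and that the last number in the model output is taken as its final answer (helper names here are illustrative):

```python
import re

def extract_final_number(text: str):
    """Return the last number in a string (commas stripped), or None."""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", text.replace(",", ""))
    return numbers[-1] if numbers else None

def exact_match(prediction: str, reference: str) -> bool:
    """Compare the model's final number with the GSM8K gold answer after '####'."""
    gold = extract_final_number(reference.split("####")[-1])
    pred = extract_final_number(prediction)
    return gold is not None and pred == gold
```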
Limitations
These results are preliminary; further evaluation and code for reproducing the dataset setup and scoring will be added.
The model may produce incorrect or overly verbose reasoning on complex multi-step problems.
Not intended for production or educational use without verification.
License
Apache 2.0
Model tree for Neural-Hacker/Qwen3-GSM8K
Base model
Qwen/Qwen3-0.6B-Base