Qwen3-GSM8K-LoRA

Qwen3-GSM8K-LoRA is a lightweight fine-tuned version of Qwen3-0.6B, adapted for multi-step mathematical reasoning on the GSM8K dataset. The model is trained to produce explicit chain-of-thought reasoning followed by a final numeric answer.


Model type: LoRA fine-tuned Qwen3-0.6B-base

Task: Mathematical reasoning and step-by-step problem solving

Base model: Qwen3-0.6B-base

Dataset: GSM8K (OpenAI)

Fine-tuning method: Low-Rank Adaptation (LoRA)
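
Usage

A minimal inference sketch with transformers and peft is shown below. The repo ids ("Qwen/Qwen3-0.6B-Base" for the base model and "Neural-Hacker/Qwen3-GSM8K" for the adapter) and the plain-text prompt format are assumptions; adjust them to the actual checkpoints.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "Qwen/Qwen3-0.6B-Base"          # assumed base checkpoint id
adapter_id = "Neural-Hacker/Qwen3-GSM8K"  # assumed adapter repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
model = PeftModel.from_pretrained(model, adapter_id)  # attach the LoRA weights

question = (
    "Natalia sold clips to 48 of her friends in April, and then she sold "
    "half as many clips in May. How many clips did Natalia sell altogether?"
)
inputs = tokenizer(question, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```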


Training Details

Technique: LoRA fine-tuning (rank = 8, alpha = 16, dropout = 0.05)

Epochs: 3

Batch size: 2

Learning rate: 2e-4

Precision: bfloat16 (mixed precision)
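
The stated hyperparameters map onto a peft LoraConfig roughly as in the sketch below. The target modules are an assumption (this card does not list them), shown here as the attention projections commonly adapted in Qwen-family models.

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

config = LoraConfig(
    r=8,                # rank, as stated above
    lora_alpha=16,      # scaling alpha
    lora_dropout=0.05,  # dropout on the LoRA layers
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed; not listed on this card
    task_type="CAUSAL_LM",
)

model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-0.6B-Base")  # assumed id
model = get_peft_model(model, config)
model.print_trainable_parameters()  # only the low-rank adapter weights are trainable
```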


Evaluation (GSM8K test set, 1,319 problems)

Qwen3-0.6B (base): 33.39%

Qwen3-GSM8K-LoRA: 35.41%

Accuracy is measured by exact match of the final numeric answer.
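
A minimal sketch of this metric is below, assuming the last number appearing in a completion is taken as its final answer (the card does not specify the exact extraction rules).

```python
import re
from typing import Optional

def extract_final_number(text: str) -> Optional[str]:
    """Return the last number in a string, with thousands separators stripped."""
    matches = re.findall(r"-?\d[\d,]*(?:\.\d+)?", text)
    return matches[-1].replace(",", "") if matches else None

def exact_match(prediction: str, reference: str) -> bool:
    """Compare the final numeric answer of a completion against the gold answer."""
    pred = extract_final_number(prediction)
    gold = extract_final_number(reference)
    return pred is not None and pred == gold

# GSM8K references end with "#### <answer>", so the last number is the gold answer.
assert exact_match("She sold 48 + 24 = 72 clips. The answer is 72.", "#### 72")
```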


Limitations

These are preliminary results; further evaluation and reproducibility code will be added.

May produce incorrect or verbose reasoning steps on complex multi-step problems.

Not intended for production or educational use without verification.


License

Apache 2.0
