---
license: apache-2.0
datasets:
  - openai/gsm8k
language:
  - en
base_model:
  - Qwen/Qwen3-0.6B-Base
pipeline_tag: question-answering
library_name: transformers
tags:
  - maths
  - openai
  - gsm8k
---

# Qwen3-GSM8K-LoRA

Qwen3-GSM8K-LoRA is a lightweight fine-tune of Qwen3-0.6B-Base, adapted for multi-step mathematical reasoning on the GSM8K dataset. The model learns to produce explicit chain-of-thought reasoning followed by a final numeric answer.


## Model Details

- **Model type:** LoRA fine-tune of Qwen3-0.6B-Base
- **Task:** Mathematical reasoning and step-by-step problem solving
- **Base model:** Qwen/Qwen3-0.6B-Base
- **Dataset:** GSM8K (OpenAI)
- **Fine-tuning method:** Low-Rank Adaptation (LoRA)
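A minimal inference sketch using `transformers` and `peft`. The adapter repo id and the prompt format below are illustrative assumptions, not confirmed by this card; adapt them to the actual adapter location and training prompt template:

```python
# Hypothetical usage sketch: load the base model, attach the LoRA adapter,
# and greedily generate a chain-of-thought answer for a GSM8K-style question.
# The adapter repo id is an assumption, not a confirmed path.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "Qwen/Qwen3-0.6B-Base"
adapter_id = "Neural-Hacker/Qwen3-GSM8K"  # assumed adapter repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
model = PeftModel.from_pretrained(model, adapter_id)

question = (
    "Natalia sold clips to 48 of her friends in April, and then she sold "
    "half as many clips in May. How many clips did Natalia sell altogether?"
)
inputs = tokenizer(f"Question: {question}\nAnswer:", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```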


## Training Details

- **Technique:** LoRA fine-tuning (rank = 8, alpha = 16, dropout = 0.05)
- **Epochs:** 3
- **Batch size:** 2
- **Learning rate:** 2e-4
- **Precision:** bfloat16 (mixed precision)
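The hyperparameters above can be expressed as a `peft` configuration. This is a sketch: `target_modules` is an assumption (typical attention projections for Qwen-family models), since the card does not state which modules were adapted:

```python
# Sketch of the LoRA configuration described in Training Details.
# target_modules is an assumed choice, not stated by the card.
from peft import LoraConfig

lora_config = LoraConfig(
    r=8,                # rank, as reported above
    lora_alpha=16,      # scaling alpha
    lora_dropout=0.05,  # dropout on the LoRA layers
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    task_type="CAUSAL_LM",
)
```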


## Evaluation

Results on the GSM8K test set (1,319 problems), scored by exact match of the final numeric answer:

| Model | Accuracy |
|---|---|
| Qwen3-0.6B (base) | 33.39 % |
| Qwen3-GSM8K-LoRA | 35.41 % |


## Limitations

- Results are preliminary; further evaluation and reproducibility code will be added.
- The model may produce incorrect or verbose reasoning steps on complex multi-step problems.
- Not intended for production or educational use without verification of its outputs.


## License

Apache 2.0