
Qwen3-8B Turkish Reasoning Agent (QLoRA)

Repo: BossriZytn/Qwen3-8B_tr_reasoning

Model Overview

This repository contains a QLoRA fine-tuned adapter for Qwen3-8B, tailored for Turkish instruction-following and chain-of-thought reasoning, trained on the ituperceptron/turkish-general-reasoning-28k dataset. The base model is the original Qwen3-8B loaded in 4-bit quantized form, with low-rank adapters applied to improve performance on Turkish reasoning tasks.

  • Base model: Qwen3-8B (8B parameters) via Qwen/Qwen3-8B in 4-bit (bitsandbytes)
  • Adapter: LoRA (rank=8, α=32) on q_proj, v_proj, o_proj
  • Method: QLoRA (4-bit quant + LoRA)
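The configuration above can be sketched with transformers and peft; the quantization and LoRA values mirror the card, while anything not stated (e.g. dropout) is left at library defaults:

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit quantization config for the QLoRA base model
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

# LoRA adapter config: rank 8, alpha 32, applied to q/v/o projections
lora_config = LoraConfig(
    r=8,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
```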

Intended Uses & Limitations

Use cases:

  • Turkish Q&A with chain-of-thought
  • Instruction-following and step-by-step reasoning
  • Agentic reasoning modules in applications

Limitations:

  • Not optimized for general-purpose text generation
  • May hallucinate facts or make reasoning errors
  • Outputs can become verbose if the generation token budget is not constrained

Training Data

  • Dataset: ituperceptron/turkish-general-reasoning-28k
  • Size: ~28K examples
  • Split: 90% train / 10% validation
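Assuming exactly 28,000 rows, the 90/10 split works out to the following approximate counts:

```python
total = 28_000            # approximate dataset size
train = int(total * 0.9)  # 90% for training
val = total - train       # 10% held out for validation
print(train, val)         # 25200 2800
```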

Fine-tuning

  • Hardware: NVIDIA A100 40GB
  • Quant: 4-bit (bnb_4bit_compute_dtype=torch.float16)
  • LoRA: rank=8, alpha=32, modules=q_proj,v_proj,o_proj
  • Training: 3 epochs; batch size 2 (gradient accumulation 8, effective batch 16); learning rate 5e-5; weight decay 0.01; warmup 5%; early stopping patience 1
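These hyperparameters map onto transformers TrainingArguments roughly as follows (a sketch; the output directory and per-epoch evaluation/save cadence are assumptions, not taken from the card):

```python
from transformers import TrainingArguments, EarlyStoppingCallback

args = TrainingArguments(
    output_dir="qwen3-8b-tr-reasoning",  # placeholder path
    num_train_epochs=3,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,       # effective batch size 2 * 8 = 16
    learning_rate=5e-5,
    weight_decay=0.01,
    warmup_ratio=0.05,                   # 5% warmup
    fp16=True,
    eval_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,         # required for early stopping
)
# Early stopping with patience 1, passed to the Trainer's callbacks
early_stop = EarlyStoppingCallback(early_stopping_patience=1)
```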

Eval

  • Validation loss: ~1.20 (cross-entropy); perplexity ≈ 3.5
  • Train–validation loss gap: < 0.01
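Perplexity is simply the exponential of the mean cross-entropy loss; as a sanity check, exp(1.20) ≈ 3.3, so a perplexity of ≈ 3.5 corresponds to a loss closer to 1.25:

```python
import math

def perplexity(ce_loss: float) -> float:
    """Perplexity is exp of the mean cross-entropy loss."""
    return math.exp(ce_loss)

print(round(perplexity(1.20), 2))  # 3.32
print(round(perplexity(1.25), 2))  # 3.49
```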

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel
import torch

# Tokenizer
tokenizer = AutoTokenizer.from_pretrained(
    'BossriZytn/Qwen3-8B_tr_reasoning', use_fast=True
)
# Base model (4-bit)
bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)
base = AutoModelForCausalLM.from_pretrained(
    'Qwen/Qwen3-8B', quantization_config=bnb, device_map='auto'
)
# Load adapter
model = PeftModel.from_pretrained(
    base, 'BossriZytn/Qwen3-8B_tr_reasoning',
    device_map='auto', torch_dtype=torch.float16
)
model.eval()

# Prompt (instruction: "What would you do if you heard the cry for help
# of a turtle flipped onto its back?")
prompt = (
    '### Talimat:\n'
    'Ters dönmüş bir kaplumbağanın yardım çığlığını duyduğunda ne yapardın?\n'
    '### Cevap:\n'
)
inputs = tokenizer(prompt, return_tensors='pt').to(model.device)

# Generate
out = model.generate(
    **inputs,
    max_new_tokens=512,                    # cap new tokens instead of total length
    do_sample=False,                       # greedy decoding
    eos_token_id=tokenizer.eos_token_id,   # stop at end-of-sequence
    pad_token_id=tokenizer.pad_token_id
)
print(tokenizer.decode(out[0], skip_special_tokens=True))
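The '### Talimat / ### Cevap' prompt format used above can be wrapped in a small helper (a hypothetical convenience function, not part of the repo):

```python
def build_prompt(instruction: str) -> str:
    """Format a Turkish instruction in the ### Talimat / ### Cevap template."""
    return f"### Talimat:\n{instruction}\n### Cevap:\n"

# "What is the capital of Turkey?"
prompt = build_prompt("Türkiye'nin başkenti neresidir?")
```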

TODO

  • Optimize this adapter as a “thinking” component for real-time agent scenarios
  • Implement more aggressive quantization and integrate ONNX/TensorRT to achieve low latency
  • Enhance model diversity through data augmentation with larger Turkish query sets
  • Develop an API layer for seamless integration into agent control loops

Ethical

  • Outputs may reflect biases present in the training data
  • Human oversight is recommended, especially in sensitive applications

License

Apache-2.0

Cite

@misc{qwen3_turkish_qlora,
  title={Qwen3-8B Turkish Reasoning Agent (QLoRA)},
  author={BossriZytn},
  year={2025},
  howpublished={\url{https://huggingface.co/BossriZytn/Qwen3-8B_tr_reasoning}}
}