# Qwen3-8B Turkish Reasoning Agent (QLoRA)

Repo: `BossriZytn/Qwen3-8B_tr_reasoning`
## Model Overview
This repository contains a QLoRA fine-tuned adapter for Qwen3-8B, tailored for Turkish instruction-following and chain-of-thought reasoning on the MCP (turkish-general-reasoning-28k) dataset. The base model remains the original Qwen3-8B in 4-bit quantized format, with low-rank adapters applied to enhance performance on Turkish reasoning tasks.
- Base model: Qwen3-8B (8B parameters) via `Qwen/Qwen3-8B` in 4-bit (bitsandbytes)
- Adapter: LoRA (rank=8, α=32) on `q_proj`, `v_proj`, `o_proj`
- Method: QLoRA (4-bit quantization + LoRA)
## Intended Uses & Limitations
Use cases:
- Turkish Q&A with chain-of-thought
- Instruction-following and step-by-step reasoning
- Agentic reasoning modules in applications
Limitations:
- Not intended for general-purpose text generation
- May hallucinate or make reasoning errors
- Outputs can be verbose if the generation budget is not managed
## Training Data
- Dataset: `ituperceptron/turkish-general-reasoning-28k`
- Size: ~28K examples
- Split: 90% train / 10% validation
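The exact split procedure is not published; a deterministic 90/10 shuffle-and-split could look like the sketch below (the helper name and seed are illustrative, not from the training script):

```python
import random

def split_90_10(examples, seed=42):
    """Shuffle a list of examples and split it 90% train / 10% validation."""
    rng = random.Random(seed)
    idx = list(range(len(examples)))
    rng.shuffle(idx)
    cut = int(len(idx) * 0.9)
    train = [examples[i] for i in idx[:cut]]
    val = [examples[i] for i in idx[cut:]]
    return train, val

# ~28K examples, as in the dataset card
data = [{"instruction": f"soru {i}"} for i in range(28_000)]
train, val = split_90_10(data)
print(len(train), len(val))  # 25200 2800
```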
## Fine-tuning
- Hardware: NVIDIA A100 40GB
- Quantization: 4-bit (`bnb_4bit_compute_dtype=torch.float16`)
- LoRA: rank=8, alpha=32, modules=`q_proj`, `v_proj`, `o_proj`
- Training: 3 epochs; batch size 2 (gradient accumulation 8); lr=5e-5; weight decay 0.01; warmup 5%; early stopping patience 1
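Under those hyperparameters, the QLoRA configuration might be set up roughly as follows. This is a sketch using standard `peft`/`transformers` objects; the actual training script is not published, and some argument names (e.g. `eval_strategy`) vary across `transformers` versions:

```python
import torch
from transformers import BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig

# 4-bit quantization with the compute dtype stated on the card
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

# LoRA on the attention projections listed above
lora_config = LoraConfig(
    r=8,
    lora_alpha=32,
    target_modules=['q_proj', 'v_proj', 'o_proj'],
    task_type='CAUSAL_LM',
)

# Schedule/optimizer settings from the card; early stopping with
# patience 1 would be added via EarlyStoppingCallback at train time
training_args = TrainingArguments(
    output_dir='qwen3-tr-qlora',
    num_train_epochs=3,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,
    learning_rate=5e-5,
    weight_decay=0.01,
    warmup_ratio=0.05,
    eval_strategy='epoch',
    save_strategy='epoch',
    load_best_model_at_end=True,
)
```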
## Evaluation
- Validation loss: ~1.20 CE (PPL ≈ 3.5)
- Train–validation loss gap: <0.01
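As a sanity check, perplexity is the exponential of the mean cross-entropy loss, so exp(1.20) ≈ 3.32, consistent with the reported ≈3.5 given the rounding in both figures:

```python
import math

val_loss = 1.20           # reported validation cross-entropy
ppl = math.exp(val_loss)  # perplexity = exp(CE loss)
print(round(ppl, 2))      # 3.32
```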
## Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel
import torch

# Tokenizer (shipped with the adapter repo)
tokenizer = AutoTokenizer.from_pretrained(
    'BossriZytn/Qwen3-8B_tr_reasoning', use_fast=True
)

# Base model in 4-bit
bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)
base = AutoModelForCausalLM.from_pretrained(
    'Qwen/Qwen3-8B', quantization_config=bnb, device_map='auto'
)

# Attach the LoRA adapter
model = PeftModel.from_pretrained(
    base, 'BossriZytn/Qwen3-8B_tr_reasoning', torch_dtype=torch.float16
)
model.eval()

# Prompt (Turkish: "What would you do if you heard an upside-down
# turtle's cry for help?")
prompt = (
    '### Talimat:\n'
    'Ters dönmüş bir kaplumbağanın yardım çığlığını duyduğunda ne yapardın?\n'
    '### Cevap:\n'
)
inputs = tokenizer(prompt, return_tensors='pt').to(model.device)

# Greedy decoding, up to 512 new tokens; generation stops at EOS
out = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=False,
    pad_token_id=tokenizer.pad_token_id,
)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```
## TODO

- Optimize this adapter as a "thinking" component for real-time agent scenarios
- Apply more aggressive quantization and integrate ONNX/TensorRT for low latency
- Improve model diversity through data augmentation with larger Turkish query sets
- Develop an API layer for seamless integration into agent control loops
## Ethical Considerations

- Outputs may reflect biases present in the training data
- Human oversight is recommended for downstream use
## License
Apache-2.0
## Citation

```bibtex
@misc{qwen3_turkish_qlora,
  title={Qwen3-8B Turkish Reasoning Agent (QLoRA)},
  author={BossriZytn},
  year={2025},
  howpublished={\url{https://huggingface.co/BossriZytn/Qwen3-8B_tr_reasoning}}
}
```