# GemMaroc: Unlocking Darija Proficiency in LLMs with Minimal Data

Paper: [arXiv:2505.17082](https://arxiv.org/abs/2505.17082)
This repository contains quantized versions of Qwen2.5-7B-Instruct-darija in GGUF format for efficient inference.
| Quantization | Description | File Size | Use Case |
|---|---|---|---|
| `f16` | FP16 (no quantization) | 14531.95 MB | Best quality, largest size |
| `q8_0` | Q8_0 (8-bit quantization) | 7723.36 MB | Recommended balance of quality and size |
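
As a sanity check on these file sizes, here is a rough back-of-envelope estimate. This is a sketch, not repository metadata: the parameter count and bits-per-weight figures are approximations, and per-file metadata overhead is ignored.

```python
# Rough GGUF file-size estimate from parameter count and bits per weight.
# Assumption: ~7.62B parameters for Qwen2.5-7B; q8_0 stores 8-bit weights
# plus one fp16 scale per 32-weight block (~8.5 bits/weight).
PARAMS = 7.62e9

BITS_PER_WEIGHT = {"f16": 16.0, "q8_0": 8.5}

for name, bpw in BITS_PER_WEIGHT.items():
    size_mb = PARAMS * bpw / 8 / 1024**2  # bytes -> MiB
    print(f"{name}: ~{size_mb:,.0f} MB")
# Prints roughly 14,534 MB for f16 and 7,722 MB for q8_0,
# in line with the table above.
```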
```bash
# Download the desired quantization
wget https://huggingface.co/GemMaroc/Qwen2.5-7B-Instruct-darija-gguf/resolve/main/Qwen2.5-7B-Instruct-darija_ckpt-*_q8_0.gguf

# Run inference
./llama-cli -m Qwen2.5-7B-Instruct-darija_ckpt-*_q8_0.gguf -p "Your prompt here"
```
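
If you prefer an HTTP endpoint over the CLI, llama.cpp also ships a built-in server. A minimal sketch, assuming a recent llama.cpp build; the port is an arbitrary example:

```bash
# Serve the model over HTTP with llama.cpp's built-in server
./llama-server -m Qwen2.5-7B-Instruct-darija_ckpt-*_q8_0.gguf --port 8080
```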
```python
from llama_cpp import Llama

# Load the quantized model
llm = Llama(
    model_path="./Qwen2.5-7B-Instruct-darija_ckpt-*_q8_0.gguf",
    n_ctx=32768,   # Context length
    n_threads=8,   # Number of CPU threads
)

# Generate text
response = llm("Your prompt here", max_tokens=512)
print(response['choices'][0]['text'])
```
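
For an instruction-tuned checkpoint like this one, the chat API is usually a better fit than raw completion. A minimal sketch reusing the `llm` object loaded above; the Darija prompt is an illustrative example (roughly "Hi, how are you?"):

```python
# Chat-style inference via llama-cpp-python's OpenAI-compatible helper
response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "السلام، كيداير؟"}],
    max_tokens=256,
)
print(response["choices"][0]["message"]["content"])
```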
Choosing a quantization:

- `f16`: largest file size, best quality
- `q8_0`: recommended
- `tq2_0` or `tq1_0`: smallest files

If you use this model, please cite the original GemMaroc paper:
```bibtex
@misc{skiredj2025gemmarocunlockingdarijaproficiency,
  title={GemMaroc: Unlocking Darija Proficiency in LLMs with Minimal Data},
  author={Abderrahman Skiredj and Ferdaous Azhari and Houdaifa Atou and Nouamane Tazi and Ismail Berrada},
  year={2025},
  eprint={2505.17082},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2505.17082},
}
```
Base model: [Qwen/Qwen2.5-7B](https://huggingface.co/Qwen/Qwen2.5-7B)