Levers Base Najdi Conversational Model (70B-IT-Merged)
- Developed by: uselevers
- Model type: Causal Language Model (Merged)
- Language(s): Arabic (Najdi dialect), English
- License: Apache 2.0
- Parameters: 70B
- Model Version: Merged instruction-tuned variant
Model Description
This is a merged 70B parameter language model specifically optimized for Najdi dialect conversational tasks. The model has been fine-tuned on a proprietary dataset containing 133 hours of authentic Najdi conversational data and subsequently merged for optimal inference performance. It excels at understanding and generating natural dialogue in the Najdi dialect.
Dataset
Levers 133 Hours Najdi Conversational Dataset (Proprietary)
- Total Conversations: 4,023
- Conversation Length: Minimum 5 turns per conversation
- Total Duration: 133 hours of conversational data
- Language: Najdi Arabic dialect
- Type: Multi-turn conversational data
- Quality: High-quality, authentic dialogues capturing natural Najdi speech patterns
This proprietary dataset ensures the model can handle extended conversational contexts and maintain coherent dialogue across multiple turns.
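The storage schema of the proprietary dataset is not published. As an illustration only, multi-turn conversational data of this kind is commonly represented as a list of role-tagged messages, which chat templates consume directly; the content below is invented for the example:

```python
# Hypothetical record in the role/content message format commonly used
# for chat fine-tuning. The actual schema of the proprietary dataset
# is not published; this is an illustration only.
conversation = [
    {"role": "user", "content": "السلام عليكم، وش أخبارك؟"},
    {"role": "assistant", "content": "هلا والله، بخير الحمد لله، وأنت؟"},
    {"role": "user", "content": "بخير، أبي أسألك عن الطقس اليوم"},
    {"role": "assistant", "content": "اليوم الجو حار شوي، خذ لك ماء معك"},
    {"role": "user", "content": "مشكور على النصيحة"},
]

# The dataset guarantees a minimum of 5 turns per conversation.
assert len(conversation) >= 5
print(len(conversation))  # → 5
```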
Training Details
Training Framework
This model was trained 2x faster using:
- Unsloth - Optimized training framework
- Hugging Face's TRL (Transformer Reinforcement Learning) library
- 4-bit quantization via bitsandbytes (BNB) for memory-efficient training
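A minimal sketch of how this training stack is typically wired together, assuming Unsloth's `FastLanguageModel` API and TRL's `SFTTrainer`; the base model name, LoRA hyperparameters, and training arguments below are assumptions, not the actual training configuration:

```python
# Illustrative sketch: 4-bit base model via Unsloth, LoRA adapters,
# supervised fine-tuning with TRL. All names and hyperparameters are
# assumptions; the real configuration is not published.
from unsloth import FastLanguageModel
from trl import SFTConfig, SFTTrainer

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/llama-3-70b-Instruct-bnb-4bit",  # hypothetical base
    max_seq_length=4096,
    load_in_4bit=True,
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,  # the proprietary Najdi conversations (not public)
    args=SFTConfig(per_device_train_batch_size=2, num_train_epochs=3),
)
trainer.train()
```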
Training & Merging Configuration
- Base Model: 70B Instruct model (4-bit quantized)
- Training Method: Supervised Fine-tuning with LoRA
- Merging Method: LoRA adapters merged back into base model
- Final Model: Full precision merged weights for optimal inference
The merged model combines the LoRA adapters with the base model weights, resulting in a single unified model without the need for adapter loading during inference.
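The merging step described above can be sketched with the PEFT library's `merge_and_unload` API; the repository names here are hypothetical placeholders, not the actual training artifacts:

```python
# Sketch of folding LoRA adapters into the base weights with PEFT.
# Repo names are illustrative, not the actual artifacts.
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "base-70b-instruct",  # hypothetical base model id
    torch_dtype="auto",
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "najdi-lora-adapters")  # hypothetical adapter id
merged = model.merge_and_unload()  # adds the LoRA deltas into the base weights
merged.save_pretrained("levers-base-najdi-70b-it-merged")
```

After merging, the saved checkpoint loads like any ordinary causal LM, with no PEFT dependency at inference time.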
Intended Use
Primary Use Cases
- Najdi dialect conversational AI
- Multi-turn dialogue systems
- Arabic dialect-specific chatbots
- Conversational assistants for Najdi-speaking regions
- Cultural and linguistic preservation applications
- Production deployment requiring fast inference
Out-of-Scope Uses
- Tasks requiring formal Modern Standard Arabic (MSA) without Najdi dialect considerations
- Real-time critical decision-making systems
- Applications where dialect-specific nuances are not important
How to Use
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "uselevers/levers-base-najdi-70b-it-merged"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    torch_dtype="auto",
)

# Example conversation
messages = [
    {"role": "user", "content": "Your message here in Najdi dialect"}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)

# Decode only the newly generated tokens, excluding the prompt
response = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[-1]:],
    skip_special_tokens=True,
)
print(response)
```
Model Advantages
- No Adapter Loading: Merged weights eliminate the need for LoRA adapter loading
- Faster Inference: Optimized for production deployment
- Simplified Deployment: Single model file structure
- Full Precision: merged weights are stored at full precision, so inference quality does not depend on keeping quantized adapters separate
Limitations
- This model is specifically optimized for Najdi dialect and may not perform as well on other Arabic dialects
- Performance may vary on topics not well-represented in the training dataset
- As with all language models, it may occasionally generate incorrect or biased information
- The model's knowledge is limited to the training data cutoff date
- Larger model size requires adequate hardware resources (recommended: 80GB+ VRAM or multi-GPU setup)
Ethical Considerations
- This model is trained on proprietary conversational data
- Users should be aware of potential biases present in conversational AI systems
- The model should not be used for generating harmful, misleading, or inappropriate content
- Proper attribution should be given when using this model in applications
Performance
The model demonstrates strong performance on:
- Multi-turn Najdi dialect conversations
- Maintaining context across extended dialogues (5+ turns)
- Natural language understanding in colloquial Najdi Arabic
- Code-switching between Najdi Arabic and English
- Consistent inference speed due to merged architecture
Hardware Requirements
Minimum Requirements
- GPU VRAM: ~140GB total across GPUs for full-precision (bf16/fp16) inference; a single 80GB GPU is feasible only with 8-bit or 4-bit quantization
- RAM: 128GB+ system RAM recommended
- Storage: ~140GB for model weights
Recommended Setup
- A100 80GB or H100 80GB GPU
- Multi-GPU setup for full-precision inference (2x A100 80GB or similar)
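For GPUs below the recommended capacity, the merged model can be re-quantized at load time with bitsandbytes, trading some output quality for a much smaller memory footprint. A sketch, with illustrative quantization settings:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit NF4 loading reduces the ~140GB fp16 footprint to roughly 35-40GB.
# Quantization settings here are illustrative, not a tested recipe.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "uselevers/levers-base-najdi-70b-it-merged",
    quantization_config=bnb_config,
    device_map="auto",
)
```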
Citation
If you use this model in your research or applications, please cite:
@misc{levers-base-najdi-70b-it-merged,
  author       = {uselevers},
  title        = {Levers Base Najdi Conversational Model (70B-IT-Merged)},
  year         = {2024},
  publisher    = {HuggingFace},
  howpublished = {\url{https://huggingface.co/uselevers/levers-base-najdi-70b-it-merged}}
}
Acknowledgments
This model was trained using:
- Unsloth for accelerated training
- Hugging Face TRL library
Contact
For questions, issues, or collaboration opportunities, please contact uselevers.
License
This model is released under the Apache 2.0 license.