
Levers Base Najdi Conversational Model (70B-IT-Merged)

  • Developed by: uselevers
  • Model type: Causal Language Model (Merged)
  • Language(s): Arabic (Najdi dialect), English
  • License: Apache 2.0
  • Parameters: 70B
  • Model Version: Merged instruction-tuned variant

Model Description

This is a merged 70B parameter language model specifically optimized for Najdi dialect conversational tasks. The model has been fine-tuned on a proprietary dataset containing 133 hours of authentic Najdi conversational data and subsequently merged for optimal inference performance. It excels at understanding and generating natural dialogue in the Najdi dialect.

Dataset

Levers 133 Hours Najdi Conversational Dataset (Proprietary)

  • Total Conversations: 4,023
  • Conversation Length: Minimum 5 turns per conversation
  • Total Duration: 133 hours of conversational data
  • Language: Najdi Arabic dialect
  • Type: Multi-turn conversational data
  • Quality: High-quality, authentic dialogues capturing natural Najdi speech patterns

This proprietary dataset ensures the model can handle extended conversational contexts and maintain coherent dialogue across multiple turns.
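A multi-turn conversation of the kind described above can be represented in the common chat-messages format. The following is a purely hypothetical illustration (the dataset itself is proprietary and its exact schema is not published; the Arabic content below is invented):

```python
# Hypothetical multi-turn conversation record in the common
# chat-messages format; the real proprietary schema may differ.
conversation = [
    {"role": "user", "content": "السلام عليكم، وش أخبارك؟"},
    {"role": "assistant", "content": "هلا والله، بخير الحمدلله، وأنت؟"},
    {"role": "user", "content": "بخير، أبي أسألك عن الطقس اليوم"},
    {"role": "assistant", "content": "اليوم الجو حار شوي، خذ معك ماء"},
    {"role": "user", "content": "طيب، مشكور"},
]

# The dataset guarantees a minimum of 5 turns per conversation
user_turns = sum(1 for m in conversation if m["role"] == "user")
assert len(conversation) >= 5
print(f"{len(conversation)} turns ({user_turns} from the user)")
```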

Training Details

Training Framework

This model was trained up to 2x faster using:

  • Unsloth - Optimized training framework
  • Hugging Face's TRL (Transformer Reinforcement Learning) library
  • 4-bit quantization (BNB) for efficient training
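The 4-bit quantization step can be illustrated with a simplified blockwise absmax scheme. This is a sketch only: bitsandbytes actually uses the NF4 data type with its own blockwise quantization, which is more sophisticated than the int4 rounding shown here.

```python
import numpy as np

def quantize_4bit(weights, block_size=64):
    """Simplified blockwise absmax 4-bit quantization (sketch, not NF4)."""
    blocks = weights.reshape(-1, block_size)
    # One scale per block: the largest absolute value maps to the int4 limit (7)
    scales = np.abs(blocks).max(axis=1, keepdims=True) / 7.0
    q = np.clip(np.round(blocks / scales), -8, 7).astype(np.int8)
    return q, scales

def dequantize_4bit(q, scales):
    """Recover approximate float weights from int4 codes and per-block scales."""
    return (q.astype(np.float32) * scales).reshape(-1)

rng = np.random.default_rng(0)
w = rng.normal(0, 0.02, size=4096).astype(np.float32)
q, scales = quantize_4bit(w)
w_hat = dequantize_4bit(q, scales)
print("max abs error:", np.abs(w - w_hat).max())
```

The point of the blockwise scales is that quantization error stays bounded by half a quantization step per block, which is what makes 4-bit training-time storage viable.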

Training & Merging Configuration

  • Base Model: 70B Instruct model (4-bit quantized)
  • Training Method: Supervised Fine-tuning with LoRA
  • Merging Method: LoRA adapters merged back into base model
  • Final Model: Merged weights exported in BF16 for inference

The merged model combines the LoRA adapters with the base model weights, resulting in a single unified model without the need for adapter loading during inference.
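The merge described above amounts to folding each adapter's low-rank update into the corresponding base weight matrix, W_merged = W + (alpha/r) * B A. A minimal numerical sketch (hypothetical shapes and scaling; in practice the merge is done per layer, e.g. via peft's merge_and_unload):

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, alpha = 256, 16, 32          # hidden size, LoRA rank, LoRA alpha (hypothetical)
W = rng.normal(size=(d, d))        # frozen base weight
A = rng.normal(size=(r, d))        # trained LoRA down-projection
B = rng.normal(size=(d, r))        # trained LoRA up-projection
x = rng.normal(size=d)

# Adapter-style forward pass: base path plus scaled low-rank path
y_adapter = W @ x + (alpha / r) * (B @ (A @ x))

# Merged forward pass: a single weight matrix, no adapter at inference time
W_merged = W + (alpha / r) * (B @ A)
y_merged = W_merged @ x

print(np.allclose(y_adapter, y_merged))  # the two paths are mathematically equivalent
```

Because the two forward passes are equivalent, merging trades nothing in output quality for a simpler single-model deployment.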

Intended Use

Primary Use Cases

  • Najdi dialect conversational AI
  • Multi-turn dialogue systems
  • Arabic dialect-specific chatbots
  • Conversational assistants for Najdi-speaking regions
  • Cultural and linguistic preservation applications
  • Production deployment requiring fast inference

Out-of-Scope Uses

  • Tasks requiring formal Modern Standard Arabic (MSA) without Najdi dialect considerations
  • Real-time critical decision-making systems
  • Applications where dialect-specific nuances are not important

How to Use

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "uselevers/levers-base-najdi-70b-it-merged"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",   # requires the `accelerate` package
    torch_dtype="auto"   # loads the checkpoint's native dtype (BF16)
)

# Example conversation
messages = [
    {"role": "user", "content": "Your message here in Najdi dialect"}
]

text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)

# Decode only the newly generated tokens, not the echoed prompt
response = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
)
print(response)

Model Advantages

  • No Adapter Loading: Merged weights eliminate the need for LoRA adapter loading
  • Faster Inference: Optimized for production deployment
  • Simplified Deployment: Single model file structure
  • BF16 Weights: Ships as a single set of half-precision (BF16) merged weights rather than quantized weights plus adapters

Limitations

  • This model is specifically optimized for Najdi dialect and may not perform as well on other Arabic dialects
  • Performance may vary on topics not well-represented in the training dataset
  • As with all language models, it may occasionally generate incorrect or biased information
  • The model's knowledge is limited to the training data cutoff date
  • Larger model size requires adequate hardware resources (recommended: 80GB+ VRAM or multi-GPU setup)

Ethical Considerations

  • This model is trained on proprietary conversational data
  • Users should be aware of potential biases present in conversational AI systems
  • The model should not be used for generating harmful, misleading, or inappropriate content
  • Proper attribution should be given when using this model in applications

Performance

The model demonstrates strong performance on:

  • Multi-turn Najdi dialect conversations
  • Maintaining context across extended dialogues (5+ turns)
  • Natural language understanding in colloquial Najdi Arabic
  • Code-switching between Najdi Arabic and English
  • Consistent inference speed due to merged architecture

Hardware Requirements

Minimum Requirements

  • GPU VRAM: ~160GB total across GPUs for BF16 inference (the weights alone are ~142GB); a single 80GB GPU requires quantized weights
  • RAM: 128GB+ system RAM recommended
  • Storage: ~140GB for model weights
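The storage figure follows directly from the parameter count: roughly 71B parameters at 2 bytes each (BF16) is about 142 GB, which is why a single 80 GB GPU cannot hold the unquantized weights.

```python
params = 71e9          # parameter count reported for this model
bytes_per_param = 2    # BF16 = 16 bits = 2 bytes
weights_gb = params * bytes_per_param / 1e9
print(f"~{weights_gb:.0f} GB of weights in BF16")  # ~142 GB
```

Activation memory and the KV cache come on top of this, so plan for headroom beyond the raw weight size.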

Recommended Setup

  • A100 80GB or H100 80GB GPU
  • Multi-GPU setup (e.g., 2x A100 80GB) to hold the full BF16 weights

Citation

If you use this model in your research or applications, please cite:

@misc{levers-base-najdi-70b-it-merged,
  author = {uselevers},
  title = {Levers Base Najdi Conversational Model (70B-IT-Merged)},
  year = {2024},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/uselevers/levers-base-najdi-70b-it-merged}}
}

Acknowledgments

This model was trained using:

  • Unsloth - Optimized training framework
  • Hugging Face's TRL library

Contact

For questions, issues, or collaboration opportunities, please contact uselevers.

License

This model is released under the Apache 2.0 license.
