# Llama-3.2-1B-Instruct - Renesas X5H

## Introduction

This repository contains the Llama-3.2-1B-Instruct model, optimized for text inference on the Renesas X5H platform.

- Model Architecture: Llama 3.2-1B is an auto-regressive language model that uses an optimized transformer architecture.
- Source Model: meta-llama/Llama-3.2-1B-Instruct
## Performance

The following performance metrics were measured on the device with a sample prompt.
| Model | Precision | Device | Response Rate (tokens/sec) |
|---|---|---|---|
| Llama-3.2-1B-Instruct | F16 | X5H - Single Cluster NPX | 16.7 |
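At the measured rate, response latency scales linearly with output length. A back-of-envelope estimate (the 256-token response length below is an arbitrary example, not a measured figure):

```shell
# Estimate response time from the measured throughput of 16.7 tokens/sec.
TOKENS=256   # example response length (arbitrary)
RATE=16.7    # measured tokens/sec on the X5H single-cluster NPX
awk -v t="$TOKENS" -v r="$RATE" 'BEGIN { printf "%.1f seconds\n", t / r }'
# prints "15.3 seconds"
```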
## Prerequisites

To run the model, you need:

- Renesas X5H board
- llama-runner CLI: For running inference on the board.
- Hugging Face CLI: For downloading the model.
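The model file can be fetched with the Hugging Face CLI. A minimal sketch, assuming the repo id `Renesas/Llama-3.2-1B-Instruct` and the GGUF filename shown in the deployment layout (verify both against the actual model page):

```shell
# Assumed repo id and filename; verify against the actual model page.
MODEL_REPO="Renesas/Llama-3.2-1B-Instruct"
MODEL_FILE="Llama-3.2-1B-Instruct-f16.gguf"
echo "Fetching ${MODEL_FILE} from ${MODEL_REPO}"
# Uncomment on a machine with huggingface-cli installed and network access:
# huggingface-cli download "${MODEL_REPO}" "${MODEL_FILE}" --local-dir .
```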
## Deployment

Copy the runner binary and the model file into a single folder on the board:

```
<PATH_ON_BOARD>
├── llama-runner
└── Llama-3.2-1B-Instruct-f16.gguf
```
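One way to stage the files is over SSH. A sketch, where `BOARD` and the destination path are placeholders you must fill in for your setup:

```shell
# BOARD and DEST are placeholders; replace with your board's user@host and path.
BOARD="root@<BOARD_IP>"
DEST="<PATH_ON_BOARD>"
echo "Copying runner and model to ${BOARD}:${DEST}"
# Uncomment once the board is reachable over SSH:
# scp llama-runner Llama-3.2-1B-Instruct-f16.gguf "${BOARD}:${DEST}/"
```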
## Inference

Run the runner with your prompt:

```shell
./llama-runner "prompt"
```