
Llama-3.2-1B-Instruct - Renesas X5H

Introduction

This repository contains the Llama-3.2-1B-Instruct model, optimized for text inference on the Renesas X5H platform.

  • Model Architecture: Llama 3.2-1B is an auto-regressive language model that uses an optimized transformer architecture.
  • Source Model: meta-llama/Llama-3.2-1B-Instruct

Performance

The following performance metrics were measured on-device with a sample prompt.

| Model                 | Precision | Device                    | Response Rate (tokens/sec) |
|-----------------------|-----------|---------------------------|----------------------------|
| Llama-3.2-1B-Instruct | F16       | X5H - Single Cluster NPX  | 16.7                       |
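Response rate here is simply tokens generated per second of decode time. A minimal sketch of the calculation (the token count and elapsed time below are illustrative, not measured values):

```shell
# Illustrative: compute tokens/sec from a token count and wall-clock seconds.
tokens=167     # hypothetical number of generated tokens
elapsed=10     # hypothetical decode time in seconds
rate=$(awk -v t="$tokens" -v s="$elapsed" 'BEGIN { printf "%.1f", t / s }')
echo "$rate tokens/sec"   # prints "16.7 tokens/sec"
```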

Prerequisites

To run the model, you need:

  1. Renesas X5H Board
  2. llama-runner CLI: For running inference on the board.
  3. Hugging Face CLI: For downloading the model.

Deployment

Copy the binary and the model into a single folder on the board:

<PATH_ON_BOARD>
├── llama-runner
└── Llama-3.2-1B-Instruct-f16.gguf
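A possible host-side sequence for fetching the model and staging it on the board is sketched below. It assumes you have accepted the gated-access terms, that the board is reachable over SSH, and that `<BOARD_IP>` and the target path are placeholders you replace with your own values:

```shell
# Sketch only: log in, download the GGUF file, and copy it to the board.
# <BOARD_IP> and <PATH_ON_BOARD> are placeholders for your environment.
huggingface-cli login
huggingface-cli download Renesas/Llama-3.2-1B-Instruct \
    Llama-3.2-1B-Instruct-f16.gguf --local-dir .
scp llama-runner Llama-3.2-1B-Instruct-f16.gguf \
    root@<BOARD_IP>:<PATH_ON_BOARD>/
```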

Inference

./llama-runner "prompt"