morbi-v023-cfa-l1

This model is a fine-tuned version of mistralai/Mistral-Small-Instruct-2409 on an unspecified dataset (the auto-generated card does not record it). It achieves the following results on the evaluation set:

  • Loss: 0.1882

Model description

More information needed

Intended uses & limitations

More information needed
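
Pending fuller documentation, the sketch below shows one plausible way to run inference with this adapter. It assumes the adapter is published as h3ir/morbi-v023-cfa-l1 and is loaded on top of the base model named above with PEFT; the dtype, device placement, and generation settings are illustrative assumptions, not documented defaults.

```python
# Minimal sketch: load the LoRA adapter on top of the base model with PEFT.
# The adapter repo id, dtype, and generation settings below are assumptions,
# not values documented in this card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "mistralai/Mistral-Small-Instruct-2409"
adapter_id = "h3ir/morbi-v023-cfa-l1"  # assumed from the repository name

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)

messages = [{"role": "user", "content": "Hello!"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```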

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 8
  • optimizer: adamw_torch_fused (betas=(0.9, 0.999), epsilon=1e-08); no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 50
  • training_steps: 5000
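
For reference, a minimal transformers TrainingArguments sketch matching the list above; the output directory and the evaluation cadence are illustrative assumptions (the 250-step interval is inferred from the results table below), not documented settings.

```python
# Sketch of a TrainingArguments configuration reproducing the hyperparameters
# listed above. output_dir and eval cadence are assumptions.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="morbi-v023-cfa-l1",   # assumed name
    learning_rate=2e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=8,    # effective train batch size: 1 * 8 = 8
    seed=42,
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_steps=50,
    max_steps=5000,
    eval_strategy="steps",            # inferred from the results table
    eval_steps=250,
)
```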

Training results

Training Loss   Epoch     Step   Validation Loss
0.7649          1.5576     250   0.1310
0.7263          3.1153     500   0.1407
0.7224          4.6729     750   0.1531
0.7107          6.2305    1000   0.1488
0.7161          7.7882    1250   0.1568
0.7022          9.3458    1500   0.1623
0.7111         10.9034    1750   0.1687
0.7025         12.4611    2000   0.1708
0.6921         14.0187    2250   0.1692
0.6957         15.5763    2500   0.1741
0.6884         17.1340    2750   0.1754
0.7043         18.6916    3000   0.1754
0.6892         20.2492    3250   0.1823
0.6861         21.8069    3500   0.1814
0.6852         23.3645    3750   0.1840
0.6836         24.9221    4000   0.1858
0.6833         26.4798    4250   0.1868
0.6833         28.0374    4500   0.1876
0.6960         29.5950    4750   0.1878
0.6853         31.1526    5000   0.1882

Note that validation loss is lowest at step 250 (0.1310) and rises steadily thereafter while training loss keeps falling, so an earlier checkpoint may generalize better than the final one reported above.

Framework versions

  • PEFT 0.13.0
  • Transformers 4.46.0
  • PyTorch 2.4.1+cu124
  • Datasets 4.5.0
  • Tokenizers 0.20.3