morbi-v023-cfa-l1

This model is a fine-tuned version of mistralai/Mistral-Small-Instruct-2409 on an unspecified dataset (the auto-generated card does not record it). It achieves the following results on the evaluation set:

  • Loss: 0.1882

Model description

More information needed

Intended uses & limitations

More information needed
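
Pending fuller documentation, the sketch below shows one plausible way to run inference with this adapter. It assumes the adapter is published as h3ir/morbi-v023-cfa-l1 and is loaded on top of the base model named above with PEFT; the dtype, device placement, and generation settings are illustrative assumptions, not documented defaults.

```python
# Minimal sketch: load the LoRA adapter on top of the base model with PEFT.
# The adapter repo id, dtype, and generation settings below are assumptions,
# not values documented in this card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "mistralai/Mistral-Small-Instruct-2409"
adapter_id = "h3ir/morbi-v023-cfa-l1"  # assumed from the repository name

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)

messages = [{"role": "user", "content": "Hello!"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```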

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 8
  • optimizer: adamw_torch_fused (betas=(0.9, 0.999), epsilon=1e-08); no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 50
  • training_steps: 5000
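
For reference, a minimal transformers TrainingArguments sketch matching the list above; the output directory and the evaluation cadence are illustrative assumptions (the 250-step interval is inferred from the results table below), not documented settings.

```python
# Sketch of a TrainingArguments configuration reproducing the hyperparameters
# listed above. output_dir and eval cadence are assumptions.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="morbi-v023-cfa-l1",   # assumed name
    learning_rate=2e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=8,    # effective train batch size: 1 * 8 = 8
    seed=42,
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_steps=50,
    max_steps=5000,
    eval_strategy="steps",            # inferred from the results table
    eval_steps=250,
)
```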

Training results

Training Loss   Epoch     Step   Validation Loss
0.7649          1.5576     250   0.1310
0.7263          3.1153     500   0.1407
0.7224          4.6729     750   0.1531
0.7107          6.2305    1000   0.1488
0.7161          7.7882    1250   0.1568
0.7022          9.3458    1500   0.1623
0.7111         10.9034    1750   0.1687
0.7025         12.4611    2000   0.1708
0.6921         14.0187    2250   0.1692
0.6957         15.5763    2500   0.1741
0.6884         17.1340    2750   0.1754
0.7043         18.6916    3000   0.1754
0.6892         20.2492    3250   0.1823
0.6861         21.8069    3500   0.1814
0.6852         23.3645    3750   0.1840
0.6836         24.9221    4000   0.1858
0.6833         26.4798    4250   0.1868
0.6833         28.0374    4500   0.1876
0.6960         29.5950    4750   0.1878
0.6853         31.1526    5000   0.1882

Note that validation loss is lowest at step 250 (0.1310) and rises steadily thereafter while training loss keeps falling, so an earlier checkpoint may generalize better than the final one reported above.

Framework versions

  • PEFT 0.13.0
  • Transformers 4.46.0
  • PyTorch 2.4.1+cu124
  • Datasets 4.5.0
  • Tokenizers 0.20.3