mini_520m_1 / README.md

Model Card

Best configuration

| Hyperparameter | Value |
| --- | --- |
| beta1 | 0.9 |
| beta2 | 0.98 |
| epsilon | 1e-10 |
| learning_rate | 0.004 |
| max_grad_norm | 0 |
| min_lr_ratio | 0 |
| nesterov | False |
| train_batch_size | 128 |
| warmup | 4000 |
| weight_decay | 0.1 |
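
To make the schedule-related values concrete, here is a minimal sketch of how `learning_rate`, `warmup`, and `min_lr_ratio` might combine into a per-step learning rate. The card does not name the optimizer, the decay shape, or the total step count, so several things here are assumptions: that `warmup` is measured in optimizer steps, that warmup is linear, that decay is cosine, and that `min_lr_ratio` scales the peak rate to give the final rate. `BEST_CONFIG`, `lr_at_step`, and `total_steps` are hypothetical names used only for illustration.

```python
import math

# Values copied from the hyperparameter list above.
BEST_CONFIG = {
    "beta1": 0.9,
    "beta2": 0.98,
    "epsilon": 1e-10,
    "learning_rate": 0.004,
    "max_grad_norm": 0,
    "min_lr_ratio": 0,
    "nesterov": False,
    "train_batch_size": 128,
    "warmup": 4000,
    "weight_decay": 0.1,
}


def lr_at_step(step, total_steps, cfg=BEST_CONFIG):
    """Learning rate at a given step.

    Assumes linear warmup followed by cosine decay; the actual decay
    shape used for this model is not stated in the card.
    """
    peak = cfg["learning_rate"]
    floor = cfg["min_lr_ratio"] * peak  # final rate; 0 with these values
    warmup = cfg["warmup"]              # assumed to be in optimizer steps
    if step < warmup:
        return peak * step / warmup
    # Fraction of the post-warmup phase completed, clamped to [0, 1].
    progress = min(1.0, (step - warmup) / max(1, total_steps - warmup))
    return floor + 0.5 * (peak - floor) * (1.0 + math.cos(math.pi * progress))
```

Under these assumptions, the rate rises from 0 to the peak of 0.004 over the first 4000 steps and then decays to 0 at `total_steps`, since `min_lr_ratio` is 0.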