vi-modernbert-ViHSD-ep20

This model is a Vietnamese ModernBERT model fine-tuned for 20 epochs on the ViHSD (Vietnamese Hate Speech Detection) dataset, as the model name suggests. It achieves the following results on the evaluation set:

  • Loss: 1.6519
  • Micro F1: 87.5
  • Micro Precision: 87.5
  • Micro Recall: 87.5
  • Macro F1: 68.1809
  • Macro Precision: 70.5794
  • Macro Recall: 66.4832
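
The identical micro scores are expected: for single-label multiclass classification, micro-averaged F1, precision, and recall all reduce to plain accuracy, while macro averaging weights every class equally, so the roughly 19-point micro/macro gap points to class imbalance in the evaluation set. A minimal sketch of the two averages (toy labels for illustration, not ViHSD data):

```python
from collections import Counter

def f1_scores(y_true, y_pred):
    """Micro- and macro-averaged F1 for single-label multiclass data."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1
            fn[t] += 1
    # Micro: pool TP/FP/FN over all classes. For single-label tasks this
    # equals accuracy, which is why micro F1 == precision == recall above.
    TP, FP, FN = sum(tp.values()), sum(fp.values()), sum(fn.values())
    micro = 2 * TP / (2 * TP + FP + FN) if TP else 0.0
    # Macro: per-class F1 averaged with equal weight, so rare classes
    # pull the score down under class imbalance.
    per_class = []
    for c in labels:
        denom = 2 * tp[c] + fp[c] + fn[c]
        per_class.append(2 * tp[c] / denom if denom else 0.0)
    macro = sum(per_class) / len(per_class)
    return micro, macro

# Imbalanced toy data: class 0 dominates, class 2 is always missed.
y_true = [0] * 8 + [1, 2]
y_pred = [0] * 8 + [1, 0]
micro, macro = f1_scores(y_true, y_pred)
print(micro, macro)  # micro stays high, macro drops with the missed class
```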

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0002
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • distributed_type: multi-GPU
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 64
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.01
  • num_epochs: 20.0
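
As a sketch of how these settings fit together (the 375 steps per epoch come from the training results table; the schedule is an approximation of the `cosine` scheduler's shape, not the exact Transformers implementation):

```python
import math

# Settings from the hyperparameter list above.
LR = 2e-4
EPOCHS = 20
STEPS_PER_EPOCH = 375                    # matches the training log
TOTAL_STEPS = EPOCHS * STEPS_PER_EPOCH   # 7500 steps overall
WARMUP_STEPS = int(0.01 * TOTAL_STEPS)   # warmup_ratio 0.01 -> 75 steps

# Effective batch size: per-device batch x gradient accumulation steps.
effective_batch = 16 * 4                 # = total_train_batch_size 64

def cosine_lr(step):
    """Linear warmup, then cosine decay to zero (an approximation of
    the `cosine` scheduler used here)."""
    if step < WARMUP_STEPS:
        return LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return LR * 0.5 * (1.0 + math.cos(math.pi * progress))

print(effective_batch)                   # 64
print(cosine_lr(WARMUP_STEPS))           # peak LR of 2e-4 after warmup
print(round(cosine_lr(TOTAL_STEPS), 10)) # decays to ~0 by the last step
```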

Training results

| Training Loss | Epoch | Step | Validation Loss | Micro F1 | Micro Precision | Micro Recall | Macro F1 | Macro Precision | Macro Recall |
|---|---|---|---|---|---|---|---|---|---|
| 1.2008 | 0.9980 | 375 | 0.3767 | 86.7141 | 86.7141 | 86.7141 | 59.5450 | 76.5138 | 54.2867 |
| 1.194 | 1.9980 | 750 | 0.3603 | 87.3503 | 87.3503 | 87.3503 | 68.2151 | 70.6976 | 66.1582 |
| 0.5185 | 2.9980 | 1125 | 0.5565 | 86.0030 | 86.0030 | 86.0030 | 67.9724 | 66.8065 | 69.3709 |
| 0.3474 | 3.9980 | 1500 | 0.6241 | 85.5539 | 85.5539 | 85.5539 | 65.7528 | 66.9509 | 66.2314 |
| 0.2354 | 4.9980 | 1875 | 0.6098 | 86.7889 | 86.7889 | 86.7889 | 64.2029 | 67.9233 | 61.4523 |
| 0.2857 | 5.9980 | 2250 | 0.8669 | 86.7889 | 86.7889 | 86.7889 | 67.2326 | 68.1950 | 67.1750 |
| 0.1326 | 6.9980 | 2625 | 0.7455 | 87.1257 | 87.1257 | 87.1257 | 68.0176 | 69.7401 | 67.3727 |
| 0.4085 | 7.9980 | 3000 | 0.9578 | 88.0240 | 88.0240 | 88.0240 | 67.6255 | 73.9917 | 64.5661 |
| 0.0273 | 8.9980 | 3375 | 1.5414 | 87.1257 | 87.1257 | 87.1257 | 64.8080 | 70.6655 | 61.2109 |
| 0.036 | 9.9980 | 3750 | 1.1192 | 87.5374 | 87.5374 | 87.5374 | 67.8021 | 71.5825 | 65.9373 |
| 0.0748 | 10.9980 | 4125 | 1.2999 | 87.3503 | 87.3503 | 87.3503 | 67.6266 | 70.4011 | 66.5444 |
| 0.0644 | 11.9980 | 4500 | 1.4459 | 87.5374 | 87.5374 | 87.5374 | 67.6215 | 71.4430 | 65.1796 |
| 0.0201 | 12.9980 | 4875 | 1.5466 | 87.6497 | 87.6497 | 87.6497 | 67.8941 | 71.5368 | 65.5838 |
| 0.01 | 13.9980 | 5250 | 1.5540 | 87.3877 | 87.3877 | 87.3877 | 68.4412 | 70.3740 | 67.0800 |
| 0.0439 | 14.9980 | 5625 | 1.5876 | 87.5749 | 87.5749 | 87.5749 | 68.6453 | 70.7319 | 67.0817 |
| 0.0628 | 15.9980 | 6000 | 1.6211 | 87.4626 | 87.4626 | 87.4626 | 67.8192 | 70.7271 | 65.8999 |
| 0.0 | 16.9980 | 6375 | 1.6364 | 87.5749 | 87.5749 | 87.5749 | 68.5724 | 70.7215 | 66.9734 |
| 0.0431 | 17.9980 | 6750 | 1.6461 | 87.5 | 87.5 | 87.5 | 68.1004 | 70.4974 | 66.3750 |
| 0.0094 | 18.9980 | 7125 | 1.6505 | 87.4626 | 87.4626 | 87.4626 | 68.0533 | 70.4170 | 66.3597 |
| 0.0174 | 19.9980 | 7500 | 1.6519 | 87.5 | 87.5 | 87.5 | 68.1809 | 70.5794 | 66.4832 |
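
Validation loss bottoms out at epoch 2 and climbs steadily afterwards while macro F1 plateaus around 68, a typical overfitting pattern; if per-epoch checkpoints were kept, selecting one by validation metric rather than taking the final epoch could be worthwhile. A quick sketch over the (rounded) epochs from the table:

```python
# (epoch, validation loss, macro F1), taken from the table above.
results = [
    (1, 0.3767, 59.5450), (2, 0.3603, 68.2151), (3, 0.5565, 67.9724),
    (4, 0.6241, 65.7528), (5, 0.6098, 64.2029), (6, 0.8669, 67.2326),
    (7, 0.7455, 68.0176), (8, 0.9578, 67.6255), (9, 1.5414, 64.8080),
    (10, 1.1192, 67.8021), (11, 1.2999, 67.6266), (12, 1.4459, 67.6215),
    (13, 1.5466, 67.8941), (14, 1.5540, 68.4412), (15, 1.5876, 68.6453),
    (16, 1.6211, 67.8192), (17, 1.6364, 68.5724), (18, 1.6461, 68.1004),
    (19, 1.6505, 68.0533), (20, 1.6519, 68.1809),
]

best_loss = min(results, key=lambda r: r[1])   # lowest validation loss
best_macro = max(results, key=lambda r: r[2])  # highest macro F1
print(best_loss[0], best_macro[0])  # epoch 2 by loss vs epoch 15 by macro F1
```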

Framework versions

  • Transformers 4.50.0
  • Pytorch 2.6.0+cu124
  • Datasets 2.15.0
  • Tokenizers 0.21.1
Model size: 150M params (Safetensors, BF16 tensors)