# vi-modernbert-ViHSD-ep20
This model appears to be a Vietnamese ModernBERT variant trained for hate speech detection on the ViHSD dataset (inferred from the model name; the auto-generated card recorded neither a base model nor a dataset). It achieves the following results on the evaluation set:
- Loss: 1.6519
- Micro F1: 87.5
- Micro Precision: 87.5
- Micro Recall: 87.5
- Macro F1: 68.1809
- Macro Precision: 70.5794
- Macro Recall: 66.4832
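For a quick smoke test, the sketch below loads the checkpoint through the Transformers `text-classification` pipeline. The repo id is a placeholder, and the label names are an assumption: ViHSD conventionally uses three classes (CLEAN, OFFENSIVE, HATE), but this card does not record the label mapping.

```python
from transformers import pipeline

# Hypothetical Hub id -- replace with the actual checkpoint path or repo.
classifier = pipeline(
    "text-classification",
    model="your-username/vi-modernbert-ViHSD-ep20",
)

# Labels map to ViHSD classes (CLEAN / OFFENSIVE / HATE -- assumed, not recorded in this card).
print(classifier("Một câu tiếng Việt cần phân loại."))
```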
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (see the `TrainingArguments` sketch after the list):
- learning_rate: 0.0002
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- distributed_type: multi-GPU
- gradient_accumulation_steps: 4
- total_train_batch_size: 64
- optimizer: AdamW (`adamw_torch`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.01
- num_epochs: 20.0
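A minimal sketch of the equivalent `TrainingArguments`, assuming a standard `Trainer` setup (model, tokenizer, datasets, and the metric function are omitted). The AdamW betas and epsilon listed above are the `adamw_torch` defaults, so they need no explicit arguments.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="vi-modernbert-ViHSD-ep20",  # placeholder
    learning_rate=2e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=4,  # 16 x 4 = total train batch size 64
    num_train_epochs=20.0,
    seed=42,
    optim="adamw_torch",
    lr_scheduler_type="cosine",
    warmup_ratio=0.01,
)
```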
### Training results
| Training Loss | Epoch | Step | Validation Loss | Micro F1 | Micro Precision | Micro Recall | Macro F1 | Macro Precision | Macro Recall |
|---|---|---|---|---|---|---|---|---|---|
| 1.2008 | 0.9980 | 375 | 0.3767 | 86.7141 | 86.7141 | 86.7141 | 59.5450 | 76.5138 | 54.2867 |
| 1.194 | 1.9980 | 750 | 0.3603 | 87.3503 | 87.3503 | 87.3503 | 68.2151 | 70.6976 | 66.1582 |
| 0.5185 | 2.9980 | 1125 | 0.5565 | 86.0030 | 86.0030 | 86.0030 | 67.9724 | 66.8065 | 69.3709 |
| 0.3474 | 3.9980 | 1500 | 0.6241 | 85.5539 | 85.5539 | 85.5539 | 65.7528 | 66.9509 | 66.2314 |
| 0.2354 | 4.9980 | 1875 | 0.6098 | 86.7889 | 86.7889 | 86.7889 | 64.2029 | 67.9233 | 61.4523 |
| 0.2857 | 5.9980 | 2250 | 0.8669 | 86.7889 | 86.7889 | 86.7889 | 67.2326 | 68.1950 | 67.1750 |
| 0.1326 | 6.9980 | 2625 | 0.7455 | 87.1257 | 87.1257 | 87.1257 | 68.0176 | 69.7401 | 67.3727 |
| 0.4085 | 7.9980 | 3000 | 0.9578 | 88.0240 | 88.0240 | 88.0240 | 67.6255 | 73.9917 | 64.5661 |
| 0.0273 | 8.9980 | 3375 | 1.5414 | 87.1257 | 87.1257 | 87.1257 | 64.8080 | 70.6655 | 61.2109 |
| 0.036 | 9.9980 | 3750 | 1.1192 | 87.5374 | 87.5374 | 87.5374 | 67.8021 | 71.5825 | 65.9373 |
| 0.0748 | 10.9980 | 4125 | 1.2999 | 87.3503 | 87.3503 | 87.3503 | 67.6266 | 70.4011 | 66.5444 |
| 0.0644 | 11.9980 | 4500 | 1.4459 | 87.5374 | 87.5374 | 87.5374 | 67.6215 | 71.4430 | 65.1796 |
| 0.0201 | 12.9980 | 4875 | 1.5466 | 87.6497 | 87.6497 | 87.6497 | 67.8941 | 71.5368 | 65.5838 |
| 0.01 | 13.9980 | 5250 | 1.5540 | 87.3877 | 87.3877 | 87.3877 | 68.4412 | 70.3740 | 67.0800 |
| 0.0439 | 14.9980 | 5625 | 1.5876 | 87.5749 | 87.5749 | 87.5749 | 68.6453 | 70.7319 | 67.0817 |
| 0.0628 | 15.9980 | 6000 | 1.6211 | 87.4626 | 87.4626 | 87.4626 | 67.8192 | 70.7271 | 65.8999 |
| 0.0 | 16.9980 | 6375 | 1.6364 | 87.5749 | 87.5749 | 87.5749 | 68.5724 | 70.7215 | 66.9734 |
| 0.0431 | 17.9980 | 6750 | 1.6461 | 87.5 | 87.5 | 87.5 | 68.1004 | 70.4974 | 66.3750 |
| 0.0094 | 18.9980 | 7125 | 1.6505 | 87.4626 | 87.4626 | 87.4626 | 68.0533 | 70.4170 | 66.3597 |
| 0.0174 | 19.9980 | 7500 | 1.6519 | 87.5 | 87.5 | 87.5 | 68.1809 | 70.5794 | 66.4832 |
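Note that the micro-averaged columns coincide because, for single-label classification, micro F1, precision, and recall all reduce to plain accuracy, while macro averaging weights every class equally and so is dragged down by the harder minority classes. A toy illustration with scikit-learn (made-up labels, not this model's predictions):

```python
from sklearn.metrics import f1_score

# Imbalanced toy data: class 0 dominates, classes 1 and 2 are rare.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 0, 0, 0, 1, 0, 2, 0]

print(f1_score(y_true, y_pred, average="micro"))  # 0.8  (equals accuracy)
print(f1_score(y_true, y_pred, average="macro"))  # ~0.73 (each class weighted equally)
```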
### Framework versions
- Transformers 4.50.0
- Pytorch 2.6.0+cu124
- Datasets 2.15.0
- Tokenizers 0.21.1