modernbert-small-amharic-50k

This model is a fine-tuned version of answerdotai/ModernBERT-base on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 2.7976
  • Model Preparation Time: 0.0019 s
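
The reported loss can be turned into a perplexity figure. The sketch below assumes the loss is a mean cross-entropy in nats (the PyTorch default) and that training used a masked-language-modeling objective; the card confirms neither, so treat the result as a rough indicator only.

```python
import math

# Evaluation loss reported in this card (assumed: mean cross-entropy in nats).
eval_loss = 2.7976

# exp(loss) gives the perplexity over the scored tokens. Under a masked-LM
# objective (an assumption here) this is often called pseudo-perplexity.
perplexity = math.exp(eval_loss)

print(f"perplexity ~ {perplexity:.2f}")
```

A value of this size means that, on average, the model is about as uncertain as if it were choosing uniformly among that many tokens at each scored position.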

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 256
  • eval_batch_size: 256
  • seed: 42
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 10000
  • num_epochs: 36
  • mixed_precision_training: Native AMP
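
To make the schedule concrete, here is a minimal pure-Python sketch of a linear schedule with warmup, using the learning rate and warmup steps listed above. The total step count (286,524, that is 36 epochs at roughly 7,959 steps per epoch) is an estimate inferred from the results table, not a value stated in this card, and `linear_warmup_lr` is a hypothetical helper name.

```python
def linear_warmup_lr(step, base_lr=1e-4, warmup_steps=10_000, total_steps=286_524):
    """Learning rate at `step` for a linear warmup followed by linear decay.

    total_steps is an ASSUMED value estimated from the results table
    (36 epochs x ~7,959 steps per epoch); adjust it if the real run differs.
    """
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to base_lr.
        return base_lr * step / warmup_steps
    # Decay: fall linearly from base_lr to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)


for s in (0, 5_000, 10_000, 150_000, 286_524):
    print(s, linear_warmup_lr(s))
```

With these settings the warmup covers a little more than the first epoch, so the peak learning rate of 0.0001 is reached only around step 10,000.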

Training results

Training Loss Epoch Step Validation Loss Model Preparation Time
10.2307 0.1249 994 9.3293 0.0019
8.7825 0.2498 1988 8.4617 0.0019
8.2383 0.3747 2982 8.0052 0.0019
7.725 0.4996 3976 7.4308 0.0019
7.1469 0.6245 4970 6.9048 0.0019
6.6885 0.7493 5964 6.4796 0.0019
6.2917 0.8742 6958 6.1138 0.0019
5.9443 0.9991 7952 5.7647 0.0019
5.6083 1.1240 8946 5.4637 0.0019
5.3113 1.2489 9940 5.1734 0.0019
5.0372 1.3738 10934 4.8990 0.0019
4.8082 1.4987 11928 4.6959 0.0019
4.6196 1.6236 12922 4.5390 0.0019
4.4694 1.7485 13916 4.4021 0.0019
4.3445 1.8734 14910 4.2839 0.0019
4.2434 1.9982 15904 4.1847 0.0019
4.1483 2.1231 16898 4.1198 0.0019
4.0722 2.2480 17892 4.0459 0.0019
4.0129 2.3729 18886 3.9963 0.0019
3.9537 2.4978 19880 3.9433 0.0019
3.9087 2.6227 20874 3.8674 0.0019
3.8619 2.7476 21868 3.8329 0.0019
3.82 2.8725 22862 3.8060 0.0019
3.784 2.9974 23856 3.7679 0.0019
3.7371 3.1223 24850 3.7382 0.0019
3.7107 3.2471 25844 3.7112 0.0019
3.6832 3.3720 26838 3.6823 0.0019
3.6592 3.4969 27832 3.6502 0.0019
3.6316 3.6218 28826 3.6265 0.0019
3.6092 3.7467 29820 3.5966 0.0019
3.5879 3.8716 30814 3.5839 0.0019
3.5702 3.9965 31808 3.5615 0.0019
3.5328 4.1214 32802 3.5439 0.0019
3.519 4.2463 33796 3.5269 0.0019
3.5046 4.3712 34790 3.5224 0.0019
3.4893 4.4960 35784 3.4990 0.0019
3.4756 4.6209 36778 3.4717 0.0019
3.4595 4.7458 37772 3.4651 0.0019
3.4429 4.8707 38766 3.4466 0.0019
3.4325 4.9956 39760 3.4381 0.0019
3.4031 5.1205 40754 3.4205 0.0019
3.3937 5.2454 41748 3.4100 0.0019
3.3855 5.3703 42742 3.4019 0.0019
3.3738 5.4952 43736 3.3809 0.0019
3.3628 5.6201 44730 3.3762 0.0019
3.3568 5.7449 45724 3.3613 0.0019
3.3451 5.8698 46718 3.3598 0.0019
3.3342 5.9947 47712 3.3545 0.0019
3.3105 6.1196 48706 3.3286 0.0019
3.3049 6.2445 49700 3.3312 0.0019
3.2964 6.3694 50694 3.3070 0.0019
3.2874 6.4943 51688 3.3043 0.0019
3.2798 6.6192 52682 3.3015 0.0019
3.2777 6.7441 53676 3.2870 0.0019
3.2713 6.8690 54670 3.2826 0.0019
3.2649 6.9938 55664 3.2819 0.0019
3.2362 7.1187 56658 3.2672 0.0019
3.2367 7.2436 57652 3.2611 0.0019
3.2267 7.3685 58646 3.2594 0.0019
3.2232 7.4934 59640 3.2483 0.0019
3.2192 7.6183 60634 3.2313 0.0019
3.2131 7.7432 61628 3.2383 0.0019
3.2067 7.8681 62622 3.2222 0.0019
3.1996 7.9930 63616 3.2285 0.0019
3.1798 8.1179 64610 3.2162 0.0019
3.1779 8.2427 65604 3.2078 0.0019
3.1729 8.3676 66598 3.2018 0.0019
3.1716 8.4925 67592 3.1950 0.0019
3.1664 8.6174 68586 3.1911 0.0019
3.1638 8.7423 69580 3.1862 0.0019
3.155 8.8672 70574 3.1658 0.0019
3.1534 8.9921 71568 3.1746 0.0019
3.1319 9.1170 72562 3.1710 0.0019
3.1294 9.2419 73556 3.1667 0.0019
3.1279 9.3668 74550 3.1557 0.0019
3.1258 9.4916 75544 3.1608 0.0019
3.1193 9.6165 76538 3.1467 0.0019
3.1158 9.7414 77532 3.1522 0.0019
3.1168 9.8663 78526 3.1391 0.0019
3.1105 9.9912 79520 3.1386 0.0019
3.0906 10.1161 80514 3.1351 0.0019
3.0887 10.2410 81508 3.1179 0.0019
3.0904 10.3659 82502 3.1275 0.0019
3.091 10.4908 83496 3.1180 0.0019
3.0823 10.6157 84490 3.1022 0.0019
3.0845 10.7405 85484 3.1070 0.0019
3.0808 10.8654 86478 3.0950 0.0019
3.0753 10.9903 87472 3.0968 0.0019
3.0597 11.1152 88466 3.0999 0.0019
3.0571 11.2401 89460 3.0939 0.0019
3.055 11.3650 90454 3.0904 0.0019
3.0547 11.4899 91448 3.0903 0.0019
3.0491 11.6148 92442 3.0851 0.0019
3.0513 11.7397 93436 3.0793 0.0019
3.0465 11.8646 94430 3.0781 0.0019
3.0469 11.9894 95424 3.0749 0.0019
3.0239 12.1143 96418 3.0681 0.0019
3.0262 12.2392 97412 3.0676 0.0019
3.0231 12.3641 98406 3.0595 0.0019
3.0236 12.4890 99400 3.0584 0.0019
3.0213 12.6139 100394 3.0597 0.0019
3.0221 12.7388 101388 3.0577 0.0019
3.0209 12.8637 102382 3.0504 0.0019
3.0185 12.9886 103376 3.0434 0.0019
2.9991 13.1135 104370 3.0419 0.0019
3.0034 13.2383 105364 3.0428 0.0019
3.0004 13.3632 106358 3.0412 0.0019
2.9944 13.4881 107352 3.0298 0.0019
2.9932 13.6130 108346 3.0311 0.0019
2.9937 13.7379 109340 3.0212 0.0019
2.9945 13.8628 110334 3.0307 0.0019
2.9929 13.9877 111328 3.0241 0.0019
2.9757 14.1126 112322 3.0302 0.0019
2.9771 14.2375 113316 3.0166 0.0019
2.9743 14.3624 114310 3.0190 0.0019
2.9752 14.4872 115304 3.0130 0.0019
2.9727 14.6121 116298 3.0035 0.0019
2.9728 14.7370 117292 3.0103 0.0019
2.9731 14.8619 118286 3.0071 0.0019
2.9717 14.9868 119280 3.0007 0.0019
2.9564 15.1117 120274 3.0101 0.0019
2.9564 15.2366 121268 2.9995 0.0019
2.9557 15.3615 122262 3.0046 0.0019
2.9527 15.4864 123256 2.9944 0.0019
2.9541 15.6113 124250 2.9918 0.0019
2.9534 15.7361 125244 3.0015 0.0019
2.9517 15.8610 126238 2.9855 0.0019
2.9505 15.9859 127232 2.9874 0.0019
2.9361 16.1108 128226 2.9721 0.0019
2.9387 16.2357 129220 2.9857 0.0019
2.9354 16.3606 130214 2.9778 0.0019
2.9373 16.4855 131208 2.9806 0.0019
2.9347 16.6104 132202 2.9793 0.0019
2.9326 16.7353 133196 2.9704 0.0019
2.9347 16.8602 134190 2.9709 0.0019
2.9305 16.9850 135184 2.9745 0.0019
2.9187 17.1099 136178 2.9763 0.0019
2.9176 17.2348 137172 2.9663 0.0019
2.9156 17.3597 138166 2.9625 0.0019
2.9182 17.4846 139160 2.9581 0.0019
2.9186 17.6095 140154 2.9585 0.0019
2.9149 17.7344 141148 2.9621 0.0019
2.917 17.8593 142142 2.9569 0.0019
2.9122 17.9842 143136 2.9544 0.0019
2.9016 18.1091 144130 2.9598 0.0019
2.8994 18.2339 145124 2.9532 0.0019
2.904 18.3588 146118 2.9590 0.0019
2.9015 18.4837 147112 2.9524 0.0019
2.9011 18.6086 148106 2.9535 0.0019
2.9002 18.7335 149100 2.9498 0.0019
2.8983 18.8584 150094 2.9408 0.0019
2.899 18.9833 151088 2.9387 0.0019
2.8878 19.1082 152082 2.9317 0.0019
2.8849 19.2331 153076 2.9355 0.0019
2.886 19.3580 154070 2.9378 0.0019
2.8848 19.4828 155064 2.9344 0.0019
2.8837 19.6077 156058 2.9335 0.0019
2.8842 19.7326 157052 2.9321 0.0019
2.8832 19.8575 158046 2.9322 0.0019
2.8841 19.9824 159040 2.9251 0.0019
2.8715 20.1073 160034 2.9305 0.0019
2.8712 20.2322 161028 2.9291 0.0019
2.8706 20.3571 162022 2.9259 0.0019
2.8714 20.4820 163016 2.9275 0.0019
2.8723 20.6069 164010 2.9176 0.0019
2.8728 20.7318 165004 2.9074 0.0019
2.8703 20.8566 165998 2.9127 0.0019
2.8707 20.9815 166992 2.9213 0.0019
2.8561 21.1064 167986 2.9116 0.0019
2.858 21.2313 168980 2.9084 0.0019
2.8613 21.3562 169974 2.9110 0.0019
2.8561 21.4811 170968 2.9085 0.0019
2.8577 21.6060 171962 2.9134 0.0019
2.8582 21.7309 172956 2.9127 0.0019
2.8563 21.8558 173950 2.9101 0.0019
2.8605 21.9807 174944 2.9085 0.0019
2.8449 22.1055 175938 2.8995 0.0019
2.8457 22.2304 176932 2.8989 0.0019
2.8426 22.3553 177926 2.8985 0.0019
2.8451 22.4802 178920 2.8988 0.0019
2.8449 22.6051 179914 2.9131 0.0019
2.8427 22.7300 180908 2.8996 0.0019
2.8468 22.8549 181902 2.9005 0.0019
2.8446 22.9798 182896 2.8960 0.0019
2.834 23.1047 183890 2.8958 0.0019
2.8343 23.2296 184884 2.8887 0.0019
2.833 23.3544 185878 2.8902 0.0019
2.8333 23.4793 186872 2.8930 0.0019
2.8313 23.6042 187866 2.8822 0.0019
2.8346 23.7291 188860 2.8866 0.0019
2.8308 23.8540 189854 2.8788 0.0019
2.8316 23.9789 190848 2.8791 0.0019
2.823 24.1038 191842 2.8787 0.0019
2.8216 24.2287 192836 2.8744 0.0019
2.822 24.3536 193830 2.8790 0.0019
2.8213 24.4785 194824 2.8780 0.0019
2.8199 24.6033 195818 2.8717 0.0019
2.8192 24.7282 196812 2.8791 0.0019
2.8187 24.8531 197806 2.8674 0.0019
2.821 24.9780 198800 2.8718 0.0019
2.8084 25.1029 199794 2.8719 0.0019
2.8095 25.2278 200788 2.8595 0.0019
2.8097 25.3527 201782 2.8686 0.0019
2.8129 25.4776 202776 2.8693 0.0019
2.8083 25.6025 203770 2.8683 0.0019
2.8101 25.7274 204764 2.8546 0.0019
2.8107 25.8522 205758 2.8725 0.0019
2.8089 25.9771 206752 2.8745 0.0019
2.7992 26.1020 207746 2.8679 0.0019
2.7995 26.2269 208740 2.8627 0.0019
2.7982 26.3518 209734 2.8672 0.0019
2.8 26.4767 210728 2.8580 0.0019
2.8029 26.6016 211722 2.8498 0.0019
2.7998 26.7265 212716 2.8551 0.0019
2.8002 26.8514 213710 2.8506 0.0019
2.7998 26.9763 214704 2.8568 0.0019
2.7931 27.1011 215698 2.8491 0.0019
2.7919 27.2260 216692 2.8544 0.0019
2.7873 27.3509 217686 2.8524 0.0019
2.7892 27.4758 218680 2.8468 0.0019
2.7943 27.6007 219674 2.8459 0.0019
2.7933 27.7256 220668 2.8488 0.0019
2.7875 27.8505 221662 2.8393 0.0019
2.7902 27.9754 222656 2.8350 0.0019
2.7834 28.1003 223650 2.8437 0.0019
2.7816 28.2252 224644 2.8475 0.0019
2.7836 28.3500 225638 2.8415 0.0019
2.7812 28.4749 226632 2.8329 0.0019
2.7825 28.5998 227626 2.8422 0.0019
2.7821 28.7247 228620 2.8402 0.0019
2.7803 28.8496 229614 2.8408 0.0019
2.7811 28.9745 230608 2.8331 0.0019
2.7742 29.0994 231602 2.8390 0.0019
2.771 29.2243 232596 2.8383 0.0019
2.7734 29.3492 233590 2.8392 0.0019
2.775 29.4741 234584 2.8260 0.0019
2.7733 29.5989 235578 2.8276 0.0019
2.7716 29.7238 236572 2.8275 0.0019
2.7694 29.8487 237566 2.8329 0.0019
2.7711 29.9736 238560 2.8297 0.0019
2.762 30.0985 239554 2.8261 0.0019
2.7637 30.2234 240548 2.8294 0.0019
2.7642 30.3483 241542 2.8178 0.0019
2.7663 30.4732 242536 2.8292 0.0019
2.7637 30.5981 243530 2.8237 0.0019
2.7655 30.7230 244524 2.8220 0.0019
2.7629 30.8478 245518 2.8180 0.0019
2.7675 30.9727 246512 2.8267 0.0019
2.7612 31.0976 247506 2.8238 0.0019
2.759 31.2225 248500 2.8212 0.0019
2.7558 31.3474 249494 2.8246 0.0019
2.7558 31.4723 250488 2.8199 0.0019
2.7547 31.5972 251482 2.8222 0.0019
2.7552 31.7221 252476 2.8172 0.0019
2.7595 31.8470 253470 2.8213 0.0019
2.7537 31.9719 254464 2.8262 0.0019
2.7518 32.0967 255458 2.8190 0.0019
2.7486 32.2216 256452 2.8203 0.0019
2.7496 32.3465 257446 2.8192 0.0019
2.7499 32.4714 258440 2.8182 0.0019
2.7506 32.5963 259434 2.8093 0.0019
2.7494 32.7212 260428 2.8108 0.0019
2.7486 32.8461 261422 2.8036 0.0019
2.7465 32.9710 262416 2.8100 0.0019
2.7439 33.0959 263410 2.8041 0.0019
2.7438 33.2208 264404 2.8054 0.0019
2.7446 33.3456 265398 2.8189 0.0019
2.7432 33.4705 266392 2.8096 0.0019
2.7403 33.5954 267386 2.8061 0.0019
2.7449 33.7203 268380 2.8140 0.0019
2.74 33.8452 269374 2.8048 0.0019
2.7442 33.9701 270368 2.8042 0.0019
2.7409 34.0950 271362 2.8087 0.0019
2.7378 34.2199 272356 2.8063 0.0019
2.7374 34.3448 273350 2.8085 0.0019
2.7353 34.4697 274344 2.8043 0.0019
2.7339 34.5945 275338 2.8042 0.0019
2.7373 34.7194 276332 2.7975 0.0019
2.7338 34.8443 277326 2.8020 0.0019
2.7387 34.9692 278320 2.8032 0.0019
2.7347 35.0941 279314 2.8031 0.0019
2.7339 35.2190 280308 2.8027 0.0019
2.7321 35.3439 281302 2.8032 0.0019
2.7314 35.4688 282296 2.7986 0.0019
2.7291 35.5937 283290 2.8041 0.0019
2.7332 35.7186 284284 2.8067 0.0019
2.7295 35.8434 285278 2.8000 0.0019
2.7322 35.9683 286272 2.8040 0.0019
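
The rows above are whitespace-separated, so they are easy to parse for a quick analysis. The sketch below copies three rows verbatim from the table, picks the one with the lowest validation loss among them, and estimates the number of optimizer steps per epoch. The dataset-size estimate additionally assumes a single device and no gradient accumulation, neither of which the card states; `parse_rows` is a hypothetical helper.

```python
# Three rows copied verbatim from the results table
# (columns: training loss, epoch, step, validation loss, model preparation time).
ROWS = """\
2.7373 34.7194 276332 2.7975 0.0019
2.7295 35.8434 285278 2.8000 0.0019
2.7322 35.9683 286272 2.8040 0.0019
"""


def parse_rows(text):
    """Parse whitespace-separated table rows into a list of dicts."""
    records = []
    for line in text.strip().splitlines():
        train_loss, epoch, step, val_loss, prep_time = line.split()
        records.append({
            "train_loss": float(train_loss),
            "epoch": float(epoch),
            "step": int(step),
            "val_loss": float(val_loss),
            "prep_time": float(prep_time),
        })
    return records


records = parse_rows(ROWS)

# Row with the lowest validation loss among the sampled rows.
best = min(records, key=lambda r: r["val_loss"])

# Steps per epoch, estimated from the last logged row.
last = records[-1]
steps_per_epoch = last["step"] / last["epoch"]

# ASSUMPTION: one device, no gradient accumulation, so one step = 256 examples.
approx_examples_per_epoch = round(steps_per_epoch) * 256

print("best sampled step:", best["step"], "val_loss:", best["val_loss"])
print("steps per epoch (approx.):", round(steps_per_epoch))
print("examples per epoch (approx., assumed):", approx_examples_per_epoch)
```

Pasting the full table into `ROWS` runs the same analysis over every logged evaluation.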

Framework versions

  • Transformers 4.52.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.6.0
  • Tokenizers 0.21.1

  • Model size: 40.8M params
  • Tensor type: F32
  • Weights format: Safetensors

Model tree for yosefw/modernbert-small-amharic-50k: fine-tuned from answerdotai/ModernBERT-base; 1 model is fine-tuned from this model.