ht-stmini-cls-v6_ftis_noPretrain-msm-pos
This model was trained on an unknown dataset; the card does not name a base model. It achieves the following results on the evaluation set:
- Loss: 1.8172
- Accuracy: 0.8925
- Macro F1: 0.7511
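The gap between accuracy and macro F1 above is typical when classes are imbalanced. The following is a minimal sketch of how the two metrics differ, assuming macro F1 is the unweighted mean of per-class F1 scores (the usual definition). The card does not say how its metrics were computed, and the labels and predictions here are purely illustrative.

```python
from collections import defaultdict


def accuracy(y_true, y_pred):
    """Fraction of positions where the prediction equals the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in either list."""
    tp, fp, fn = defaultdict(int), defaultdict(int), defaultdict(int)
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted p, but it was wrong
            fn[t] += 1  # missed the true label t
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical, imbalanced toy data: the rare class "b" is never predicted.
y_true = ["a"] * 8 + ["b"] * 2
y_pred = ["a"] * 10

print(accuracy(y_true, y_pred))  # high, dominated by the frequent class
print(macro_f1(y_true, y_pred))  # much lower, because class "b" scores 0
```

Because every class contributes equally to macro F1, a model can reach high accuracy while rare classes still pull the macro score down.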
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 8
- eval_batch_size: 4
- seed: 42
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 6733
- training_steps: 134675
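To make the schedule settings above concrete, here is a minimal sketch of a linear warmup followed by linear decay, using the learning rate, warmup steps, and training steps listed. It assumes the common formulation in which the rate rises linearly from 0 to the base value during warmup and then falls linearly to 0 at the final step; the function name is hypothetical and this is not taken from the training code.

```python
BASE_LR = 1e-4          # learning_rate from the list above
WARMUP_STEPS = 6733     # lr_scheduler_warmup_steps
TOTAL_STEPS = 134675    # training_steps


def linear_schedule_lr(step, base_lr=BASE_LR,
                       warmup_steps=WARMUP_STEPS, total_steps=TOTAL_STEPS):
    """Learning rate at a given optimizer step (assumed linear warmup + decay)."""
    if step < warmup_steps:
        # Warmup: scale linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay: scale linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 3000, WARMUP_STEPS, 33969, TOTAL_STEPS):
    print(step, linear_schedule_lr(step))
```

Under these assumptions the rate peaks at step 6733. The last step logged in the results table (33969) falls well before the configured 134675 training steps, so the rate would still be a large fraction of its peak at that point.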
Training results
Training Loss | Epoch | Step | Validation Loss | Accuracy | Macro F1 |
---|---|---|---|---|---|
59.3464 | 0.0013 | 169 | 38.9204 | 0.0790 | 0.0371 |
25.3687 | 1.0012 | 338 | 95.5343 | 0.1876 | 0.0584 |
7.8147 | 2.0012 | 507 | 195.3371 | 0.4964 | 0.1235 |
6.8891 | 3.0012 | 676 | 191.6926 | 0.5398 | 0.1325 |
6.2177 | 4.0012 | 845 | 216.1844 | 0.5697 | 0.1383 |
5.4678 | 5.0012 | 1014 | 189.3503 | 0.5881 | 0.1441 |
4.9594 | 6.0012 | 1183 | 137.5813 | 0.5951 | 0.1504 |
4.1078 | 7.0012 | 1352 | 94.4770 | 0.6093 | 0.1529 |
3.5533 | 8.0012 | 1521 | 64.6198 | 0.6143 | 0.1643 |
3.2608 | 9.0012 | 1690 | 45.8943 | 0.6192 | 0.1693 |
2.9641 | 10.0012 | 1859 | 39.1380 | 0.6195 | 0.1724 |
2.7522 | 11.0012 | 2028 | 33.9856 | 0.6396 | 0.1836 |
2.7176 | 12.0012 | 2197 | 29.4991 | 0.6150 | 0.1799 |
2.5893 | 13.0012 | 2366 | 22.8487 | 0.6356 | 0.2027 |
2.457 | 14.0012 | 2535 | 19.0762 | 0.6483 | 0.2075 |
2.4692 | 15.0011 | 2704 | 16.5434 | 0.6498 | 0.2202 |
2.309 | 16.0011 | 2873 | 14.8761 | 0.6640 | 0.2347 |
2.3723 | 17.0011 | 3042 | 12.4455 | 0.6624 | 0.2501 |
2.2035 | 18.0011 | 3211 | 12.1223 | 0.6490 | 0.2582 |
2.1096 | 19.0011 | 3380 | 12.5485 | 0.6854 | 0.2777 |
1.9845 | 20.0011 | 3549 | 10.0986 | 0.6758 | 0.2975 |
1.9108 | 21.0011 | 3718 | 10.0700 | 0.6911 | 0.3001 |
1.9174 | 22.0011 | 3887 | 6.9076 | 0.7244 | 0.3483 |
1.8328 | 23.0011 | 4056 | 9.2308 | 0.7231 | 0.3387 |
1.6883 | 24.0011 | 4225 | 7.6798 | 0.7277 | 0.3699 |
1.6764 | 25.0011 | 4394 | 7.3750 | 0.7291 | 0.3764 |
1.5494 | 26.0011 | 4563 | 7.9343 | 0.7419 | 0.3963 |
1.4642 | 27.0011 | 4732 | 10.0410 | 0.7604 | 0.4199 |
1.3226 | 28.0010 | 4901 | 7.8654 | 0.7586 | 0.4226 |
1.3147 | 29.0010 | 5070 | 9.4925 | 0.7783 | 0.4468 |
1.2191 | 30.0010 | 5239 | 9.3531 | 0.7751 | 0.4472 |
1.2129 | 31.0010 | 5408 | 9.6055 | 0.7794 | 0.4657 |
1.1316 | 32.0010 | 5577 | 9.9082 | 0.7883 | 0.4837 |
1.1428 | 33.0010 | 5746 | 10.4082 | 0.7953 | 0.4963 |
1.086 | 34.0010 | 5915 | 10.5887 | 0.8013 | 0.5049 |
0.9935 | 35.0010 | 6084 | 9.5130 | 0.7909 | 0.4851 |
0.976 | 36.0010 | 6253 | 11.5568 | 0.7926 | 0.5008 |
0.9635 | 37.0010 | 6422 | 14.4823 | 0.8069 | 0.5124 |
0.8799 | 38.0010 | 6591 | 12.5622 | 0.7883 | 0.5165 |
0.9134 | 39.0010 | 6760 | 14.6331 | 0.8065 | 0.5279 |
0.7942 | 40.0010 | 6929 | 14.0959 | 0.8103 | 0.5368 |
0.7707 | 41.0010 | 7098 | 14.0539 | 0.7993 | 0.5011 |
0.7535 | 42.0009 | 7267 | 12.6058 | 0.8070 | 0.5297 |
0.6551 | 43.0009 | 7436 | 16.7901 | 0.8179 | 0.5502 |
0.6933 | 44.0009 | 7605 | 15.4039 | 0.8189 | 0.5602 |
0.6253 | 45.0009 | 7774 | 17.1676 | 0.8196 | 0.5664 |
0.5847 | 46.0009 | 7943 | 17.8274 | 0.8237 | 0.5642 |
0.5747 | 47.0009 | 8112 | 18.4719 | 0.8216 | 0.5661 |
0.5041 | 48.0009 | 8281 | 17.1367 | 0.8210 | 0.5588 |
0.5139 | 49.0009 | 8450 | 16.5208 | 0.8356 | 0.5884 |
0.4726 | 50.0009 | 8619 | 21.7155 | 0.8230 | 0.5779 |
0.4511 | 51.0009 | 8788 | 22.6982 | 0.8353 | 0.5928 |
0.4668 | 52.0009 | 8957 | 19.2348 | 0.8347 | 0.5851 |
0.3898 | 53.0009 | 9126 | 18.7129 | 0.8342 | 0.5923 |
0.3695 | 54.0009 | 9295 | 18.8422 | 0.8399 | 0.5990 |
0.3801 | 55.0008 | 9464 | 18.5114 | 0.8413 | 0.6028 |
0.3443 | 56.0008 | 9633 | 16.4898 | 0.8432 | 0.6050 |
0.3616 | 57.0008 | 9802 | 18.9793 | 0.8486 | 0.6226 |
0.3354 | 58.0008 | 9971 | 17.3367 | 0.8461 | 0.6132 |
0.3294 | 59.0008 | 10140 | 18.4233 | 0.8450 | 0.6191 |
0.3136 | 60.0008 | 10309 | 18.7376 | 0.8482 | 0.6177 |
0.2854 | 61.0008 | 10478 | 15.7951 | 0.8552 | 0.6289 |
0.2838 | 62.0008 | 10647 | 17.5900 | 0.8503 | 0.6238 |
0.2536 | 63.0008 | 10816 | 16.5078 | 0.8525 | 0.6307 |
0.2617 | 64.0008 | 10985 | 17.8552 | 0.8454 | 0.6350 |
0.2461 | 65.0008 | 11154 | 16.2489 | 0.8569 | 0.6432 |
0.237 | 66.0008 | 11323 | 12.4754 | 0.8532 | 0.6423 |
0.2176 | 67.0008 | 11492 | 10.9631 | 0.8579 | 0.6467 |
0.2119 | 68.0007 | 11661 | 13.9939 | 0.8601 | 0.6474 |
0.2068 | 69.0007 | 11830 | 12.4334 | 0.8532 | 0.6404 |
0.1956 | 70.0007 | 11999 | 14.0338 | 0.8593 | 0.6504 |
0.1878 | 71.0007 | 12168 | 11.7233 | 0.8602 | 0.6506 |
0.1875 | 72.0007 | 12337 | 10.3088 | 0.8642 | 0.6645 |
0.2062 | 73.0007 | 12506 | 9.0553 | 0.8611 | 0.6599 |
0.1822 | 74.0007 | 12675 | 7.1633 | 0.8624 | 0.6625 |
0.171 | 75.0007 | 12844 | 8.1621 | 0.8593 | 0.6589 |
0.165 | 76.0007 | 13013 | 7.7144 | 0.8647 | 0.6712 |
0.1602 | 77.0007 | 13182 | 8.1336 | 0.8627 | 0.6610 |
0.1586 | 78.0007 | 13351 | 6.4341 | 0.8619 | 0.6650 |
0.154 | 79.0007 | 13520 | 6.1606 | 0.8663 | 0.6677 |
0.1455 | 80.0007 | 13689 | 6.5879 | 0.8625 | 0.6630 |
0.1442 | 81.0007 | 13858 | 6.1570 | 0.8701 | 0.6782 |
0.1419 | 82.0006 | 14027 | 6.2919 | 0.8675 | 0.6719 |
0.1372 | 83.0006 | 14196 | 5.1177 | 0.8662 | 0.6750 |
0.1181 | 84.0006 | 14365 | 4.8350 | 0.8666 | 0.6795 |
0.1252 | 85.0006 | 14534 | 4.3481 | 0.8711 | 0.6811 |
0.1269 | 86.0006 | 14703 | 4.7319 | 0.8656 | 0.6819 |
0.123 | 87.0006 | 14872 | 4.7458 | 0.8714 | 0.6839 |
0.1278 | 88.0006 | 15041 | 3.9598 | 0.8708 | 0.6858 |
0.1122 | 89.0006 | 15210 | 3.5908 | 0.8754 | 0.6937 |
0.117 | 90.0006 | 15379 | 3.8412 | 0.8721 | 0.6865 |
0.1141 | 91.0006 | 15548 | 3.8014 | 0.8730 | 0.6889 |
0.1064 | 92.0006 | 15717 | 3.6323 | 0.8726 | 0.6889 |
0.1106 | 93.0006 | 15886 | 3.5584 | 0.8734 | 0.6908 |
0.1019 | 94.0006 | 16055 | 3.3942 | 0.8738 | 0.6897 |
0.0961 | 95.0005 | 16224 | 3.2197 | 0.8749 | 0.6961 |
0.0992 | 96.0005 | 16393 | 3.3110 | 0.8706 | 0.6976 |
0.1046 | 97.0005 | 16562 | 2.9395 | 0.8757 | 0.6981 |
0.0967 | 98.0005 | 16731 | 3.0819 | 0.8715 | 0.6927 |
0.0852 | 99.0005 | 16900 | 2.7328 | 0.8725 | 0.6929 |
0.0909 | 100.0005 | 17069 | 2.6428 | 0.8763 | 0.7052 |
0.0884 | 101.0005 | 17238 | 2.6717 | 0.8754 | 0.7019 |
0.0877 | 102.0005 | 17407 | 2.6857 | 0.8721 | 0.7029 |
0.0787 | 103.0005 | 17576 | 2.5938 | 0.8768 | 0.7032 |
0.0849 | 104.0005 | 17745 | 2.5608 | 0.8771 | 0.7090 |
0.08 | 105.0005 | 17914 | 2.5091 | 0.8756 | 0.6984 |
0.074 | 106.0005 | 18083 | 2.5128 | 0.8803 | 0.7063 |
0.0714 | 107.0005 | 18252 | 2.4161 | 0.8773 | 0.7068 |
0.0771 | 108.0005 | 18421 | 2.2031 | 0.8794 | 0.7094 |
0.0895 | 109.0004 | 18590 | 2.4464 | 0.8804 | 0.7163 |
0.0728 | 110.0004 | 18759 | 2.3420 | 0.8759 | 0.7125 |
0.0971 | 111.0004 | 18928 | 2.3484 | 0.8781 | 0.7017 |
0.0752 | 112.0004 | 19097 | 2.2271 | 0.8790 | 0.7178 |
0.067 | 113.0004 | 19266 | 2.4369 | 0.8765 | 0.7120 |
0.0741 | 114.0004 | 19435 | 2.2862 | 0.8777 | 0.7139 |
0.0738 | 115.0004 | 19604 | 2.2621 | 0.8785 | 0.7091 |
0.062 | 116.0004 | 19773 | 2.3678 | 0.8787 | 0.7111 |
0.0747 | 117.0004 | 19942 | 2.4907 | 0.8745 | 0.6961 |
0.0653 | 118.0004 | 20111 | 2.1874 | 0.8827 | 0.7161 |
0.0658 | 119.0004 | 20280 | 2.0325 | 0.8838 | 0.7217 |
0.0652 | 120.0004 | 20449 | 2.1266 | 0.8820 | 0.7156 |
0.0597 | 121.0004 | 20618 | 2.1654 | 0.8801 | 0.7149 |
0.0757 | 122.0003 | 20787 | 1.9499 | 0.8866 | 0.7287 |
0.0593 | 123.0003 | 20956 | 2.2619 | 0.8791 | 0.7118 |
0.0576 | 124.0003 | 21125 | 2.1054 | 0.8829 | 0.7184 |
0.0616 | 125.0003 | 21294 | 2.1809 | 0.8796 | 0.7182 |
0.0601 | 126.0003 | 21463 | 1.8629 | 0.8872 | 0.7277 |
0.0548 | 127.0003 | 21632 | 2.1090 | 0.8821 | 0.7166 |
0.0572 | 128.0003 | 21801 | 2.2149 | 0.8784 | 0.7153 |
0.0561 | 129.0003 | 21970 | 1.9411 | 0.8856 | 0.7303 |
0.0554 | 130.0003 | 22139 | 2.0311 | 0.8813 | 0.7183 |
0.0509 | 131.0003 | 22308 | 2.1824 | 0.8839 | 0.7188 |
0.0616 | 132.0003 | 22477 | 2.1547 | 0.8830 | 0.7189 |
0.0511 | 133.0003 | 22646 | 2.0628 | 0.8816 | 0.7195 |
0.0491 | 134.0003 | 22815 | 1.9718 | 0.8859 | 0.7266 |
0.0564 | 135.0003 | 22984 | 2.0992 | 0.8835 | 0.7223 |
0.0477 | 136.0002 | 23153 | 1.9783 | 0.8828 | 0.7216 |
0.049 | 137.0002 | 23322 | 2.1633 | 0.8851 | 0.7205 |
0.0448 | 138.0002 | 23491 | 2.0866 | 0.8815 | 0.7248 |
0.0508 | 139.0002 | 23660 | 2.2082 | 0.8861 | 0.7229 |
0.0561 | 140.0002 | 23829 | 2.0297 | 0.8831 | 0.7263 |
0.0476 | 141.0002 | 23998 | 2.1306 | 0.8822 | 0.7247 |
0.0562 | 142.0002 | 24167 | 2.2599 | 0.8819 | 0.7182 |
0.0433 | 143.0002 | 24336 | 2.0900 | 0.8846 | 0.7249 |
0.0454 | 144.0002 | 24505 | 2.0576 | 0.8852 | 0.7280 |
0.0439 | 145.0002 | 24674 | 1.9070 | 0.8861 | 0.7354 |
0.0427 | 146.0002 | 24843 | 1.9406 | 0.8859 | 0.7255 |
0.0435 | 147.0002 | 25012 | 1.8753 | 0.8858 | 0.7350 |
0.0386 | 148.0002 | 25181 | 2.0958 | 0.8831 | 0.7261 |
0.0421 | 149.0001 | 25350 | 2.0531 | 0.8840 | 0.7237 |
0.0418 | 150.0001 | 25519 | 2.0721 | 0.8848 | 0.7228 |
0.0392 | 151.0001 | 25688 | 2.2118 | 0.8828 | 0.7232 |
0.0431 | 152.0001 | 25857 | 2.3609 | 0.8840 | 0.7200 |
0.0449 | 153.0001 | 26026 | 1.9814 | 0.8851 | 0.7302 |
0.0395 | 154.0001 | 26195 | 1.9841 | 0.8889 | 0.7342 |
0.0428 | 155.0001 | 26364 | 2.1003 | 0.8851 | 0.7314 |
0.0378 | 156.0001 | 26533 | 2.0011 | 0.8899 | 0.7357 |
0.0417 | 157.0001 | 26702 | 1.9045 | 0.8885 | 0.7313 |
0.0369 | 158.0001 | 26871 | 2.1463 | 0.8817 | 0.7237 |
0.0381 | 159.0001 | 27040 | 1.8939 | 0.8891 | 0.7359 |
0.039 | 160.0001 | 27209 | 2.1339 | 0.8860 | 0.7280 |
0.0409 | 161.0001 | 27378 | 2.0200 | 0.8849 | 0.7332 |
0.0357 | 162.0001 | 27547 | 1.9856 | 0.8894 | 0.7385 |
0.0388 | 163.0000 | 27716 | 1.8941 | 0.8889 | 0.7352 |
0.0405 | 164.0000 | 27885 | 2.1008 | 0.8845 | 0.7219 |
0.0392 | 165.0000 | 28054 | 2.0878 | 0.8837 | 0.7295 |
0.0361 | 166.0000 | 28223 | 1.8980 | 0.8880 | 0.7381 |
0.0353 | 167.0000 | 28392 | 1.9293 | 0.8882 | 0.7324 |
0.0344 | 168.0000 | 28561 | 1.8784 | 0.8889 | 0.7405 |
0.0327 | 168.0013 | 28730 | 1.8787 | 0.8903 | 0.7377 |
0.032 | 169.0013 | 28899 | 2.2492 | 0.8857 | 0.7300 |
0.0348 | 170.0012 | 29068 | 2.1504 | 0.8874 | 0.7359 |
0.0377 | 171.0012 | 29237 | 2.1245 | 0.8897 | 0.7373 |
0.0343 | 172.0012 | 29406 | 1.9574 | 0.8892 | 0.7370 |
0.0337 | 173.0012 | 29575 | 1.9612 | 0.8885 | 0.7381 |
0.0375 | 174.0012 | 29744 | 2.0124 | 0.8905 | 0.7427 |
0.0314 | 175.0012 | 29913 | 2.0747 | 0.8893 | 0.7352 |
0.0328 | 176.0012 | 30082 | 2.0896 | 0.8876 | 0.7319 |
0.0313 | 177.0012 | 30251 | 1.9872 | 0.8885 | 0.7355 |
0.0325 | 178.0012 | 30420 | 2.3101 | 0.8868 | 0.7323 |
0.0323 | 179.0012 | 30589 | 1.8263 | 0.8925 | 0.7511 |
0.0368 | 180.0012 | 30758 | 1.9961 | 0.8904 | 0.7403 |
0.0325 | 181.0012 | 30927 | 2.0162 | 0.8897 | 0.7433 |
0.0318 | 182.0012 | 31096 | 1.9951 | 0.8882 | 0.7396 |
0.033 | 183.0012 | 31265 | 1.7084 | 0.8951 | 0.7484 |
0.0308 | 184.0011 | 31434 | 2.0011 | 0.8884 | 0.7386 |
0.0302 | 185.0011 | 31603 | 1.9691 | 0.8909 | 0.7389 |
0.0312 | 186.0011 | 31772 | 2.0769 | 0.8913 | 0.7410 |
0.0334 | 187.0011 | 31941 | 2.1664 | 0.8862 | 0.7300 |
0.0295 | 188.0011 | 32110 | 2.1281 | 0.8898 | 0.7387 |
0.0287 | 189.0011 | 32279 | 2.3105 | 0.8861 | 0.7256 |
0.033 | 190.0011 | 32448 | 1.9435 | 0.8925 | 0.7389 |
0.029 | 191.0011 | 32617 | 2.0848 | 0.8894 | 0.7417 |
0.032 | 192.0011 | 32786 | 2.0616 | 0.8869 | 0.7329 |
0.029 | 193.0011 | 32955 | 2.1727 | 0.8867 | 0.7336 |
0.0254 | 194.0011 | 33124 | 2.1293 | 0.8911 | 0.7405 |
0.0263 | 195.0011 | 33293 | 2.0403 | 0.8916 | 0.7358 |
0.0283 | 196.0011 | 33462 | 2.1992 | 0.8904 | 0.7407 |
0.0308 | 197.0010 | 33631 | 2.1202 | 0.8900 | 0.7379 |
0.0274 | 198.0010 | 33800 | 2.0473 | 0.8911 | 0.7401 |
0.0255 | 199.0010 | 33969 | 2.1266 | 0.8914 | 0.7441 |
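Different selection criteria can point to different rows of a log like the one above. The sketch below parses a short excerpt (three rows copied from the table) and picks the best row by macro F1 and by validation loss. The field names and helper are hypothetical, and the card does not state how the reported checkpoint was chosen.

```python
# Three rows copied verbatim from the results table (not the full log).
ROWS = """\
0.0323 | 179.0012 | 30589 | 1.8263 | 0.8925 | 0.7511 |
0.033 | 183.0012 | 31265 | 1.7084 | 0.8951 | 0.7484 |
0.0255 | 199.0010 | 33969 | 2.1266 | 0.8914 | 0.7441 |
"""

# Hypothetical field names matching the table's column order.
FIELDS = ("train_loss", "epoch", "step", "val_loss", "accuracy", "macro_f1")


def parse_rows(text):
    """Turn pipe-separated table rows into a list of dicts."""
    rows = []
    for line in text.strip().splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        row = dict(zip(FIELDS, (float(c) for c in cells)))
        row["step"] = int(row["step"])
        rows.append(row)
    return rows


rows = parse_rows(ROWS)
best_by_f1 = max(rows, key=lambda r: r["macro_f1"])
best_by_loss = min(rows, key=lambda r: r["val_loss"])

print("best macro F1:", best_by_f1["step"], best_by_f1["macro_f1"])
print("lowest validation loss:", best_by_loss["step"], best_by_loss["val_loss"])
```

In this excerpt the two criteria disagree: the row with the highest macro F1 is not the row with the lowest validation loss.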
Framework versions
- Transformers 4.46.0
- Pytorch 2.3.1+cu121
- Datasets 2.20.0
- Tokenizers 0.20.1