t5-v1_1-base-gramatika-e8-b16

This model is a fine-tuned version of google/t5-v1_1-base on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2980
  • Rouge1: 37.8004
  • Rouge2: 25.1687
  • RougeL: 37.0767
  • RougeLsum: 37.065
  • Gen Len: 18.9591
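
These figures match the epoch 4.99 / step 4218 row of the training results table below, i.e. the checkpoint with the lowest validation loss.

The checkpoint loads with the standard Transformers seq2seq classes. Below is a minimal inference sketch; the repository id is taken from the card title and may need the owner's namespace prepended, and the input format is an assumption, since the card does not document the task (the name "gramatika" hints at grammatical error correction, but this is unconfirmed).

```python
# Minimal inference sketch, not the author's documented usage.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "t5-v1_1-base-gramatika-e8-b16"  # assumption: prepend "<owner>/" as needed
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Hypothetical input; the card does not document the expected prompt format.
inputs = tokenizer("Your input sentence here.", return_tensors="pt")
# Gen Len of ~19 in the results table suggests outputs were capped near the
# default generation max_length of 20.
outputs = model.generate(**inputs, max_length=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```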

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adafactor
  • lr_scheduler_type: linear
  • num_epochs: 8
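
These settings map naturally onto Seq2SeqTrainingArguments. A sketch under assumptions, not the author's script: the step spacing in the results table below implies evaluation every 74 steps, and predict_with_generate is inferred from the ROUGE / Gen Len columns.

```python
# Reconstruction of the listed hyperparameters as Seq2SeqTrainingArguments
# (Transformers 4.30.x). output_dir, eval cadence, and generation settings
# are assumptions; only the bulleted values above come from the card.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="t5-v1_1-base-gramatika-e8-b16",  # hypothetical path
    learning_rate=1e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    optim="adafactor",             # Adafactor, as listed above
    lr_scheduler_type="linear",
    num_train_epochs=8,
    evaluation_strategy="steps",   # inferred: the table evaluates every 74 steps
    eval_steps=74,
    predict_with_generate=True,    # inferred from the ROUGE / Gen Len columns
)
```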

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | Gen Len |
|--------------:|------:|-----:|----------------:|-------:|-------:|-------:|----------:|--------:|
| 4.776 | 0.09 | 74 | 1.0632 | 32.6953 | 20.0972 | 31.9469 | 31.9621 | 18.7484 |
| 1.2729 | 0.18 | 148 | 0.7526 | 36.7533 | 23.3303 | 35.6567 | 35.6663 | 18.9461 |
| 0.9446 | 0.26 | 222 | 0.6354 | 37.1264 | 23.6467 | 36.0249 | 36.0251 | 18.9532 |
| 0.7947 | 0.35 | 296 | 0.5734 | 37.1871 | 23.6899 | 36.1041 | 36.1107 | 18.9479 |
| 0.7537 | 0.44 | 370 | 0.5584 | 37.1245 | 23.4797 | 36.0896 | 36.1022 | 18.9520 |
| 0.6918 | 0.53 | 444 | 0.5143 | 37.3209 | 23.6466 | 36.2475 | 36.2523 | 18.9509 |
| 0.6461 | 0.61 | 518 | 0.4959 | 37.362 | 23.9226 | 36.3161 | 36.3077 | 18.9550 |
| 0.6208 | 0.7 | 592 | 0.4934 | 37.3042 | 23.895 | 36.279 | 36.2776 | 18.9550 |
| 0.578 | 0.79 | 666 | 0.4600 | 36.9323 | 23.2291 | 35.8836 | 35.9033 | 18.9526 |
| 0.5595 | 0.88 | 740 | 0.4325 | 37.3255 | 23.9018 | 36.2997 | 36.2994 | 18.9544 |
| 0.5341 | 0.96 | 814 | 0.4401 | 37.6132 | 24.1158 | 36.5666 | 36.5629 | 18.9473 |
| 0.4909 | 1.05 | 888 | 0.4288 | 37.4095 | 23.9467 | 36.3822 | 36.3773 | 18.9556 |
| 0.484 | 1.14 | 962 | 0.4112 | 37.1324 | 23.6944 | 36.1397 | 36.146 | 18.9562 |
| 0.4529 | 1.23 | 1036 | 0.4173 | 37.3368 | 23.6993 | 36.3614 | 36.3581 | 18.9485 |
| 0.4491 | 1.31 | 1110 | 0.4031 | 37.6721 | 24.3716 | 36.6349 | 36.6283 | 18.9580 |
| 0.4649 | 1.4 | 1184 | 0.3850 | 37.1553 | 23.726 | 36.1654 | 36.1631 | 18.9568 |
| 0.4388 | 1.49 | 1258 | 0.3802 | 37.4997 | 24.1832 | 36.4843 | 36.4895 | 18.9597 |
| 0.436 | 1.58 | 1332 | 0.3751 | 37.7226 | 24.25 | 36.6127 | 36.6266 | 18.9562 |
| 0.4338 | 1.66 | 1406 | 0.3746 | 37.5729 | 24.1241 | 36.5254 | 36.5372 | 18.9562 |
| 0.4226 | 1.75 | 1480 | 0.3648 | 37.4497 | 24.2013 | 36.5387 | 36.5329 | 18.9556 |
| 0.4215 | 1.84 | 1554 | 0.3603 | 37.3854 | 23.9057 | 36.4769 | 36.4907 | 18.9556 |
| 0.4107 | 1.93 | 1628 | 0.3608 | 37.4492 | 24.2621 | 36.5402 | 36.5518 | 18.9574 |
| 0.3955 | 2.01 | 1702 | 0.3555 | 36.899 | 23.6411 | 36.0131 | 36.0335 | 18.9603 |
| 0.3615 | 2.1 | 1776 | 0.3516 | 36.8815 | 23.6418 | 36.0194 | 36.0134 | 18.9568 |
| 0.3641 | 2.19 | 1850 | 0.3494 | 37.6507 | 24.5903 | 36.7702 | 36.7744 | 18.9580 |
| 0.347 | 2.28 | 1924 | 0.3475 | 37.2491 | 23.94 | 36.3766 | 36.3915 | 18.9556 |
| 0.345 | 2.36 | 1998 | 0.3448 | 37.7311 | 24.7039 | 36.8714 | 36.8805 | 18.9597 |
| 0.3447 | 2.45 | 2072 | 0.3428 | 37.3581 | 24.439 | 36.5772 | 36.5706 | 18.9532 |
| 0.3513 | 2.54 | 2146 | 0.3449 | 37.5704 | 24.503 | 36.6679 | 36.6694 | 18.9532 |
| 0.3425 | 2.63 | 2220 | 0.3307 | 37.2403 | 24.0095 | 36.3901 | 36.4088 | 18.9538 |
| 0.3451 | 2.71 | 2294 | 0.3413 | 37.8927 | 24.9543 | 37.0627 | 37.0752 | 18.9515 |
| 0.337 | 2.8 | 2368 | 0.3295 | 37.2903 | 24.0792 | 36.4794 | 36.4851 | 18.9562 |
| 0.3411 | 2.89 | 2442 | 0.3279 | 37.5595 | 24.4696 | 36.6409 | 36.634 | 18.9586 |
| 0.3352 | 2.98 | 2516 | 0.3246 | 37.8787 | 24.9008 | 37.0554 | 37.0518 | 18.9520 |
| 0.2922 | 3.07 | 2590 | 0.3284 | 37.7723 | 24.8132 | 36.9398 | 36.9411 | 18.9556 |
| 0.2877 | 3.15 | 2664 | 0.3263 | 37.8679 | 24.9922 | 37.0879 | 37.086 | 18.9515 |
| 0.2821 | 3.24 | 2738 | 0.3272 | 38.1672 | 25.4381 | 37.3518 | 37.35 | 18.9562 |
| 0.2999 | 3.33 | 2812 | 0.3250 | 37.8501 | 25.0341 | 37.0643 | 37.053 | 18.9556 |
| 0.2953 | 3.42 | 2886 | 0.3223 | 37.8668 | 24.8381 | 37.0085 | 37.0079 | 18.9574 |
| 0.2892 | 3.5 | 2960 | 0.3180 | 37.7468 | 24.8882 | 36.9065 | 36.9151 | 18.9574 |
| 0.2997 | 3.59 | 3034 | 0.3154 | 37.5096 | 24.6657 | 36.6896 | 36.6843 | 18.9591 |
| 0.2924 | 3.68 | 3108 | 0.3153 | 37.8218 | 25.0111 | 37.0717 | 37.0657 | 18.9526 |
| 0.2891 | 3.77 | 3182 | 0.3125 | 37.9909 | 25.1394 | 37.185 | 37.1986 | 18.9532 |
| 0.2836 | 3.85 | 3256 | 0.3142 | 37.9429 | 25.2072 | 37.2037 | 37.2072 | 18.9591 |
| 0.2829 | 3.94 | 3330 | 0.3058 | 37.4522 | 24.6425 | 36.7227 | 36.7314 | 18.9556 |
| 0.2698 | 4.03 | 3404 | 0.3147 | 37.9525 | 25.2168 | 37.1852 | 37.1746 | 18.9562 |
| 0.2472 | 4.12 | 3478 | 0.3156 | 37.8397 | 24.8158 | 37.0507 | 37.0609 | 18.9544 |
| 0.2454 | 4.2 | 3552 | 0.3147 | 37.8964 | 25.1594 | 37.1437 | 37.1277 | 18.9568 |
| 0.2486 | 4.29 | 3626 | 0.3176 | 37.8525 | 25.0361 | 37.0716 | 37.0948 | 18.9568 |
| 0.2419 | 4.38 | 3700 | 0.3171 | 37.8339 | 25.1664 | 37.0724 | 37.0811 | 18.9580 |
| 0.2482 | 4.47 | 3774 | 0.3162 | 37.8943 | 25.2648 | 37.1299 | 37.1326 | 18.9574 |
| 0.2438 | 4.55 | 3848 | 0.3124 | 37.8348 | 25.1174 | 37.0646 | 37.0685 | 18.9538 |
| 0.2546 | 4.64 | 3922 | 0.3116 | 37.7776 | 25.0245 | 37.009 | 37.0062 | 18.9526 |
| 0.2399 | 4.73 | 3996 | 0.3100 | 37.7403 | 24.8735 | 36.9705 | 36.9589 | 18.9538 |
| 0.2439 | 4.82 | 4070 | 0.3063 | 37.6132 | 24.8849 | 36.8696 | 36.8678 | 18.9568 |
| 0.2399 | 4.9 | 4144 | 0.3047 | 38.0775 | 25.4368 | 37.3176 | 37.331 | 18.9538 |
| 0.2453 | 4.99 | 4218 | 0.2980 | 37.8004 | 25.1687 | 37.0767 | 37.065 | 18.9591 |
| 0.2113 | 5.08 | 4292 | 0.3156 | 37.8066 | 25.2105 | 37.0718 | 37.0732 | 18.9568 |
| 0.2112 | 5.17 | 4366 | 0.3140 | 37.9331 | 25.1857 | 37.2142 | 37.2266 | 18.9538 |
| 0.2073 | 5.25 | 4440 | 0.3130 | 37.7596 | 25.0255 | 37.0438 | 37.0355 | 18.9515 |
| 0.2088 | 5.34 | 4514 | 0.3089 | 37.6381 | 24.9435 | 36.9008 | 36.9068 | 18.9562 |
| 0.2096 | 5.43 | 4588 | 0.3133 | 37.6629 | 24.8797 | 36.9224 | 36.9201 | 18.9550 |
| 0.2105 | 5.52 | 4662 | 0.3077 | 37.6381 | 24.8911 | 36.9154 | 36.9082 | 18.9515 |
| 0.2137 | 5.6 | 4736 | 0.3107 | 37.9448 | 25.2433 | 37.1702 | 37.191 | 18.9538 |
| 0.2149 | 5.69 | 4810 | 0.3036 | 37.887 | 25.3403 | 37.1722 | 37.1505 | 18.9574 |
| 0.2113 | 5.78 | 4884 | 0.3071 | 37.75 | 25.2014 | 37.0775 | 37.061 | 18.9568 |
| 0.2112 | 5.87 | 4958 | 0.3055 | 37.9112 | 25.3054 | 37.2048 | 37.1822 | 18.9562 |
| 0.2207 | 5.96 | 5032 | 0.3043 | 37.7232 | 25.0175 | 36.9981 | 36.9904 | 18.9562 |
| 0.1931 | 6.04 | 5106 | 0.3146 | 37.6859 | 24.8467 | 36.9791 | 36.9622 | 18.9532 |
| 0.1794 | 6.13 | 5180 | 0.3192 | 37.6117 | 24.9014 | 36.9037 | 36.8909 | 18.9544 |
| 0.1809 | 6.22 | 5254 | 0.3174 | 37.6985 | 25.0269 | 37.0038 | 36.9698 | 18.9556 |
| 0.187 | 6.31 | 5328 | 0.3179 | 37.905 | 25.2766 | 37.1956 | 37.1917 | 18.9556 |
| 0.1857 | 6.39 | 5402 | 0.3121 | 37.7023 | 25.1466 | 37.0309 | 37.0343 | 18.9532 |
| 0.1852 | 6.48 | 5476 | 0.3160 | 37.9916 | 25.3421 | 37.2952 | 37.2883 | 18.9526 |
| 0.1901 | 6.57 | 5550 | 0.3130 | 37.7959 | 25.1191 | 37.108 | 37.1069 | 18.9550 |
| 0.1746 | 6.66 | 5624 | 0.3149 | 37.8307 | 25.1864 | 37.1278 | 37.111 | 18.9544 |
| 0.1797 | 6.74 | 5698 | 0.3133 | 37.7555 | 25.071 | 37.1049 | 37.0749 | 18.9562 |
| 0.1868 | 6.83 | 5772 | 0.3109 | 37.907 | 25.3167 | 37.2214 | 37.197 | 18.9532 |
| 0.1853 | 6.92 | 5846 | 0.3096 | 37.8557 | 25.2451 | 37.1764 | 37.1619 | 18.9538 |
| 0.1775 | 7.01 | 5920 | 0.3100 | 37.8791 | 25.1896 | 37.1719 | 37.1602 | 18.9532 |
| 0.159 | 7.09 | 5994 | 0.3183 | 37.6891 | 24.9679 | 37.0226 | 36.9983 | 18.9532 |
| 0.1633 | 7.18 | 6068 | 0.3191 | 37.8515 | 25.2206 | 37.1993 | 37.1785 | 18.9556 |
| 0.1623 | 7.27 | 6142 | 0.3178 | 37.7481 | 25.0795 | 37.0553 | 37.037 | 18.9562 |
| 0.1657 | 7.36 | 6216 | 0.3172 | 37.7833 | 25.1949 | 37.1478 | 37.1191 | 18.9532 |
| 0.1607 | 7.44 | 6290 | 0.3192 | 37.9413 | 25.3067 | 37.2541 | 37.2406 | 18.9526 |
| 0.1625 | 7.53 | 6364 | 0.3179 | 37.8266 | 25.2507 | 37.1517 | 37.1373 | 18.9532 |
| 0.1621 | 7.62 | 6438 | 0.3180 | 37.753 | 25.1062 | 37.1077 | 37.0825 | 18.9556 |
| 0.162 | 7.71 | 6512 | 0.3193 | 37.8685 | 25.3361 | 37.2299 | 37.1984 | 18.9526 |
| 0.1598 | 7.79 | 6586 | 0.3189 | 37.8672 | 25.2207 | 37.1865 | 37.1632 | 18.9526 |
| 0.1554 | 7.88 | 6660 | 0.3192 | 37.9556 | 25.3004 | 37.2645 | 37.2502 | 18.9526 |
| 0.1644 | 7.97 | 6734 | 0.3188 | 37.8834 | 25.2903 | 37.2138 | 37.1836 | 18.9526 |
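
The Rouge1/Rouge2/RougeL/RougeLsum and Gen Len columns are the usual outputs of a Seq2SeqTrainer evaluation loop. Below is a sketch of a compute_metrics function that would produce them, using the evaluate library; this is a plausible reconstruction, not the author's code.

```python
# Plausible compute_metrics for the columns above, assuming the `evaluate`
# library and a loaded tokenizer. Bind `tokenizer` (e.g. via functools.partial)
# before passing this to Seq2SeqTrainer, which calls it with one argument.
import numpy as np
import evaluate

rouge = evaluate.load("rouge")

def compute_metrics(eval_preds, tokenizer):
    preds, labels = eval_preds
    # Labels use -100 for ignored positions; swap in the pad token before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    result = rouge.compute(predictions=decoded_preds,
                           references=decoded_labels, use_stemmer=True)
    # `evaluate` returns fractions; the table reports percentages.
    result = {k: round(v * 100, 4) for k, v in result.items()}
    # Gen Len: mean generated length in tokens, excluding padding.
    result["gen_len"] = float(np.mean(
        [np.count_nonzero(p != tokenizer.pad_token_id) for p in preds]))
    return result
```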

Framework versions

  • Transformers 4.30.1
  • Pytorch 1.11.0a0+b6df043
  • Datasets 2.12.0
  • Tokenizers 0.13.3