train_stsb_1745333595

This model is a fine-tuned version of mistralai/Mistral-7B-Instruct-v0.3 on the stsb dataset. It achieves the following results on the evaluation set:

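  • Loss: 0.2668 (equal to the lowest validation loss in the training results, reached at step 5800)
  • Num Input Tokens Seen: 61177152 (cumulative count at the final step, 40000)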
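
A minimal loading sketch follows. It assumes the adapter is published under the repository id rbelanec/train_stsb_1745333595 (as shown on the hosting page) and that it attaches to the base model as a causal-LM PEFT adapter. The function name load_adapter is hypothetical. This card does not document a prompt format, so none is shown.

```python
BASE_MODEL_ID = "mistralai/Mistral-7B-Instruct-v0.3"
# Assumption: repository id as shown on the hosting page.
ADAPTER_ID = "rbelanec/train_stsb_1745333595"


def load_adapter(base_model_id: str = BASE_MODEL_ID, adapter_id: str = ADAPTER_ID):
    """Load the base model and tokenizer, then attach the PEFT adapter."""
    # Imported lazily so that importing this file does not require the heavy libraries.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_model_id)
    # Assumption: the adapter was trained on top of the causal-LM head.
    base_model = AutoModelForCausalLM.from_pretrained(base_model_id)
    # PeftModel.from_pretrained wraps the base model with the adapter weights.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()
    return model, tokenizer
```

Calling load_adapter() downloads the full 7B base model, so it is best run on a machine with enough memory and disk space for it.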

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.3
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 123
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 16
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • training_steps: 40000
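
The batch-size and scheduler entries above can be checked with a little arithmetic. The sketch below assumes a single device and no warmup (neither is listed above), and uses a half-cycle cosine decay to zero, which is the form the default cosine schedule in Transformers takes when warmup is zero. The function names are illustrative.

```python
import math

LEARNING_RATE = 0.3
TRAIN_BATCH_SIZE = 4
GRADIENT_ACCUMULATION_STEPS = 4
TRAINING_STEPS = 40000


def total_train_batch_size(per_device: int, accumulation: int, num_devices: int = 1) -> int:
    """Number of examples consumed per optimizer step."""
    return per_device * accumulation * num_devices


def cosine_lr(step: int, base_lr: float = LEARNING_RATE, total_steps: int = TRAINING_STEPS) -> float:
    """Half-cycle cosine decay from base_lr at step 0 to 0 at total_steps (no warmup assumed)."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


print(total_train_batch_size(TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS))  # 16
# Learning rate at the start, the midpoint, and the end of training.
print(cosine_lr(0), cosine_lr(TRAINING_STEPS // 2), cosine_lr(TRAINING_STEPS))
```

Under these assumptions the learning rate starts at 0.3, is about half of that at step 20000, and reaches zero at step 40000.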

Training results

Training Loss   Epoch   Step   Validation Loss   Input Tokens Seen
0.3915 0.6182 200 0.4912 304960
0.2668 1.2349 400 0.3558 610112
0.3027 1.8532 600 0.3570 918240
0.2573 2.4699 800 0.3305 1223440
0.2581 3.0866 1000 0.3148 1529568
0.4142 3.7048 1200 0.3205 1838464
0.2769 4.3215 1400 0.3080 2144560
0.2291 4.9397 1600 0.2977 2450736
0.2145 5.5564 1800 0.3139 2755856
0.2554 6.1731 2000 0.2992 3063440
0.2521 6.7913 2200 0.2902 3368976
0.3044 7.4080 2400 0.2987 3677040
0.2534 8.0247 2600 0.3155 3983872
0.2416 8.6430 2800 0.2891 4292480
0.2533 9.2597 3000 0.2984 4594560
0.221 9.8779 3200 0.2846 4900544
0.2279 10.4946 3400 0.2847 5206928
0.2165 11.1113 3600 0.2820 5511472
0.2089 11.7295 3800 0.2890 5815280
0.2102 12.3462 4000 0.2872 6122240
0.2472 12.9645 4200 0.2774 6427616
0.2119 13.5811 4400 0.2743 6733776
0.2093 14.1978 4600 0.2818 7038848
0.2802 14.8161 4800 0.2703 7344384
0.1749 15.4328 5000 0.2951 7651280
0.1811 16.0495 5200 0.2856 7955504
0.1694 16.6677 5400 0.2910 8262864
0.1876 17.2844 5600 0.2822 8568256
0.195 17.9026 5800 0.2668 8873856
0.1784 18.5193 6000 0.2850 9180288
0.1717 19.1360 6200 0.2957 9486288
0.2038 19.7543 6400 0.2840 9792720
0.1578 20.3709 6600 0.3026 10100576
0.1607 20.9892 6800 0.3016 10406848
0.161 21.6059 7000 0.3050 10713296
0.1826 22.2226 7200 0.3285 11016800
0.1649 22.8408 7400 0.3289 11325536
0.1486 23.4575 7600 0.3313 11631392
0.1468 24.0742 7800 0.3300 11936144
0.165 24.6924 8000 0.3389 12244560
0.135 25.3091 8200 0.3384 12549728
0.1594 25.9274 8400 0.3279 12858400
0.1427 26.5440 8600 0.3523 13163216
0.1324 27.1607 8800 0.3634 13469440
0.1621 27.7790 9000 0.3703 13774400
0.1272 28.3957 9200 0.3663 14082512
0.1359 29.0124 9400 0.3360 14385408
0.1242 29.6306 9600 0.3765 14692096
0.1259 30.2473 9800 0.3994 14996480
0.1339 30.8655 10000 0.4001 15302624
0.1115 31.4822 10200 0.4308 15609936
0.0909 32.0989 10400 0.4408 15915040
0.1211 32.7172 10600 0.4281 16222112
0.0754 33.3338 10800 0.4210 16525360
0.1127 33.9521 11000 0.4296 16833040
0.0717 34.5688 11200 0.5212 17138928
0.0851 35.1855 11400 0.4581 17446224
0.0824 35.8037 11600 0.4812 17754192
0.0805 36.4204 11800 0.5338 18056816
0.0802 37.0371 12000 0.5045 18365904
0.0824 37.6553 12200 0.5051 18669424
0.0572 38.2720 12400 0.5928 18975680
0.0689 38.8903 12600 0.5256 19284128
0.0877 39.5070 12800 0.5032 19589440
0.069 40.1236 13000 0.5432 19892304
0.054 40.7419 13200 0.5622 20201904
0.0571 41.3586 13400 0.5713 20507296
0.0491 41.9768 13600 0.5488 20814240
0.0494 42.5935 13800 0.5668 21117472
0.0611 43.2102 14000 0.5911 21424352
0.0625 43.8284 14200 0.5955 21729344
0.0496 44.4451 14400 0.6157 22035168
0.0371 45.0618 14600 0.6133 22341904
0.0471 45.6801 14800 0.6059 22646640
0.026 46.2968 15000 0.6613 22952944
0.0351 46.9150 15200 0.6269 23260240
0.0379 47.5317 15400 0.6851 23566048
0.0321 48.1484 15600 0.6763 23871504
0.0373 48.7666 15800 0.6797 24175696
0.0284 49.3833 16000 0.7029 24480832
0.0392 50.0 16200 0.6647 24786896
0.0222 50.6182 16400 0.7084 25092208
0.0276 51.2349 16600 0.6964 25398288
0.0237 51.8532 16800 0.7134 25707024
0.0207 52.4699 17000 0.7202 26010848
0.0133 53.0866 17200 0.7254 26319616
0.0251 53.7048 17400 0.7938 26623232
0.0288 54.3215 17600 0.7129 26932512
0.0173 54.9397 17800 0.7487 27238304
0.0165 55.5564 18000 0.7846 27542688
0.0194 56.1731 18200 0.7524 27848608
0.0144 56.7913 18400 0.7748 28156128
0.014 57.4080 18600 0.7994 28463824
0.0309 58.0247 18800 0.7437 28768304
0.0076 58.6430 19000 0.8176 29076400
0.0105 59.2597 19200 0.7934 29381968
0.0147 59.8779 19400 0.7984 29688144
0.0109 60.4946 19600 0.8231 29993744
0.0102 61.1113 19800 0.8141 30299024
0.0185 61.7295 20000 0.8043 30604816
0.0121 62.3462 20200 0.7951 30909520
0.02 62.9645 20400 0.8212 31217744
0.0073 63.5811 20600 0.8317 31523296
0.01 64.1978 20800 0.8920 31827424
0.0121 64.8161 21000 0.8372 32135904
0.0106 65.4328 21200 0.8723 32439120
0.0061 66.0495 21400 0.8708 32747712
0.0116 66.6677 21600 0.8543 33052672
0.0114 67.2844 21800 0.8384 33358560
0.0061 67.9026 22000 0.8718 33664736
0.0062 68.5193 22200 0.8799 33967392
0.0069 69.1360 22400 0.8613 34272592
0.003 69.7543 22600 0.9006 34578896
0.0019 70.3709 22800 0.9288 34883440
0.0014 70.9892 23000 0.9470 35188496
0.0039 71.6059 23200 0.9528 35492880
0.0026 72.2226 23400 0.9618 35798304
0.0027 72.8408 23600 0.9678 36105856
0.0085 73.4575 23800 0.8849 36408816
0.0064 74.0742 24000 0.8983 36716560
0.0068 74.6924 24200 0.8801 37025168
0.0039 75.3091 24400 0.9125 37330368
0.0085 75.9274 24600 0.8928 37636736
0.0017 76.5440 24800 0.9118 37941312
0.0008 77.1607 25000 0.9296 38246144
0.0035 77.7790 25200 0.9426 38552576
0.0006 78.3957 25400 0.9587 38857104
0.0008 79.0124 25600 0.9646 39165040
0.0009 79.6306 25800 0.9722 39472304
0.0005 80.2473 26000 0.9845 39777616
0.0025 80.8655 26200 0.9913 40084368
0.0075 81.4822 26400 0.9292 40388032
0.0044 82.0989 26600 0.8865 40694320
0.0128 82.7172 26800 0.9107 41001712
0.0022 83.3338 27000 0.9214 41305200
0.0012 83.9521 27200 0.9427 41615216
0.0008 84.5688 27400 0.9602 41920400
0.0003 85.1855 27600 0.9745 42224944
0.0004 85.8037 27800 0.9833 42528304
0.0003 86.4204 28000 0.9964 42836528
0.0002 87.0371 28200 0.9940 43141440
0.0002 87.6553 28400 1.0034 43445216
0.0004 88.2720 28600 1.0097 43750304
0.0002 88.8903 28800 1.0162 44055584
0.0032 89.5070 29000 1.0219 44361616
0.0006 90.1236 29200 1.0235 44665936
0.0048 90.7419 29400 1.0258 44972144
0.0001 91.3586 29600 1.0315 45276416
0.0003 91.9768 29800 1.0406 45583712
0.0002 92.5935 30000 1.0404 45888688
0.0001 93.2102 30200 1.0478 46195456
0.0002 93.8284 30400 1.0460 46500288
0.0001 94.4451 30600 1.0491 46804992
0.0001 95.0618 30800 1.0544 47112576
0.0001 95.6801 31000 1.0624 47418816
0.0002 96.2968 31200 1.0587 47723232
0.0016 96.9150 31400 1.0241 48029888
0.0096 97.5317 31600 1.0041 48335504
0.0004 98.1484 31800 0.9700 48640352
0.0007 98.7666 32000 0.9848 48945632
0.0013 99.3833 32200 0.9948 49253952
0.0003 100.0 32400 0.9980 49557760
0.0001 100.6182 32600 1.0072 49863392
0.0007 101.2349 32800 1.0100 50171184
0.0002 101.8532 33000 1.0170 50477424
0.0008 102.4699 33200 1.0209 50781472
0.0001 103.0866 33400 1.0264 51085008
0.0001 103.7048 33600 1.0309 51393296
0.0011 104.3215 33800 1.0308 51697808
0.0001 104.9397 34000 1.0317 52004880
0.0001 105.5564 34200 1.0370 52308944
0.0006 106.1731 34400 1.0422 52616512
0.0002 106.7913 34600 1.0431 52921600
0.0001 107.4080 34800 1.0459 53227040
0.0001 108.0247 35000 1.0487 53533488
0.0001 108.6430 35200 1.0537 53838704
0.0001 109.2597 35400 1.0533 54143984
0.0001 109.8779 35600 1.0562 54449808
0.0001 110.4946 35800 1.0580 54754304
0.0001 111.1113 36000 1.0632 55060864
0.0001 111.7295 36200 1.0636 55367296
0.0001 112.3462 36400 1.0645 55670672
0.0001 112.9645 36600 1.0676 55978256
0.0001 113.5811 36800 1.0657 56283024
0.0006 114.1978 37000 1.0700 56590928
0.0001 114.8161 37200 1.0715 56897936
0.0007 115.4328 37400 1.0732 57200192
0.0002 116.0495 37600 1.0714 57505872
0.0001 116.6677 37800 1.0744 57811120
0.0001 117.2844 38000 1.0752 58116320
0.0001 117.9026 38200 1.0750 58425376
0.0001 118.5193 38400 1.0758 58732208
0.0001 119.1360 38600 1.0753 59038688
0.0001 119.7543 38800 1.0777 59342656
0.0 120.3709 39000 1.0752 59647664
0.0001 120.9892 39200 1.0771 59954128
0.0002 121.6059 39400 1.0752 60260256
0.0001 122.2226 39600 1.0748 60563120
0.0001 122.8408 39800 1.0767 60870320
0.0001 123.4575 40000 1.0755 61177152
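
To make the trend in the table easier to read, the sketch below picks the evaluation step with the lowest validation loss and compares it with the final step. Only a handful of rows are copied from the table. The names LOG_ROWS, parse_rows, and best_row are illustrative and not part of the training code.

```python
# A few rows copied from the table above, in the same column order:
# training loss, epoch, step, validation loss, input tokens seen.
LOG_ROWS = """\
0.3915 0.6182 200 0.4912 304960
0.2291 4.9397 1600 0.2977 2450736
0.195 17.9026 5800 0.2668 8873856
0.1339 30.8655 10000 0.4001 15302624
0.0185 61.7295 20000 0.8043 30604816
0.0002 92.5935 30000 1.0404 45888688
0.0001 123.4575 40000 1.0755 61177152
"""


def parse_rows(text):
    """Turn whitespace-separated log lines into a list of dictionaries."""
    rows = []
    for line in text.strip().splitlines():
        train_loss, epoch, step, val_loss, tokens = line.split()
        rows.append({
            "train_loss": float(train_loss),
            "epoch": float(epoch),
            "step": int(step),
            "val_loss": float(val_loss),
            "tokens": int(tokens),
        })
    return rows


def best_row(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row["val_loss"])


rows = parse_rows(LOG_ROWS)
best = best_row(rows)
final = rows[-1]
print(f"best step: {best['step']} (validation loss {best['val_loss']:.4f})")
print(f"final step: {final['step']} (validation loss {final['val_loss']:.4f})")
```

On these rows the validation loss is lowest at step 5800 and rises afterwards while the training loss keeps falling toward zero, which is the usual signature of overfitting.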

Framework versions

  • PEFT 0.15.1
  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1
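
To compare a local environment with the versions listed above, a small check along these lines may help. It is a sketch that assumes the libraries are installed under the distribution names peft, transformers, torch, datasets, and tokenizers. The helper find_mismatches is hypothetical. A PyTorch build without the +cu124 suffix will be reported as a mismatch even if the base version is the same.

```python
from importlib import metadata

# Versions listed above, keyed by the assumed distribution names.
EXPECTED_VERSIONS = {
    "peft": "0.15.1",
    "transformers": "4.51.3",
    "torch": "2.6.0+cu124",
    "datasets": "3.5.0",
    "tokenizers": "0.21.1",
}


def find_mismatches(expected, get_version=metadata.version):
    """Return {package: (expected, installed or None)} for every package that differs."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


for name, (wanted, installed) in find_mismatches(EXPECTED_VERSIONS).items():
    print(f"{name}: card lists {wanted}, environment has {installed}")
```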

Model tree for rbelanec/train_stsb_1745333595: this model is an adapter of mistralai/Mistral-7B-Instruct-v0.3.