mmlu_small_noaugs_llama_lora
This model is a fine-tuned version of Daewon0808/prm800k_llama_fulltune on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 0.5531
- Prm accuracy: 0.8413
- Prm precision: 0.8655
- Prm recall: 0.9626
- Prm specificity: 0.1579
- Prm npv: 0.4286
- Prm f1: 0.9115
- Prm f1 neg: 0.2308
- Prm f1 auc: 0.5603
- Prm f1 auc (fixed): 0.8635
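The ratios above are consistent with a single confusion matrix over 126 evaluation examples (TP=103, FP=16, TN=3, FN=4). These counts are not stated in this card; they are inferred from the reported values and should be read as an assumption. Under that assumption, the sketch below reproduces each reported number. The helper name `prm_metrics` is hypothetical. The value reported as "Prm f1 auc" matches the mean of recall and specificity (balanced accuracy) in this reconstruction; "Prm f1 auc (fixed)" cannot be recovered from counts alone and is left out.

```python
def prm_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Binary-classification metrics from confusion-matrix counts."""
    recall = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)     # true negative rate
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": tp / (tp + fp),
        "recall": recall,
        "specificity": specificity,
        "npv": tn / (tn + fn),                      # negative predictive value
        "f1": 2 * tp / (2 * tp + fp + fn),          # F1 for the positive class
        "f1_neg": 2 * tn / (2 * tn + fn + fp),      # F1 for the negative class
        "balanced_accuracy": (recall + specificity) / 2,
    }


# Counts inferred from the reported ratios (an assumption, not from the card).
metrics = prm_metrics(tp=103, fp=16, tn=3, fn=4)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Read this way, the evaluation set holds 107 positive and 19 negative examples, which would explain why accuracy and F1 stay high while specificity and negative-class F1 stay low.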
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 2
- eval_batch_size: 4
- seed: 908932403
- distributed_type: multi-GPU
- num_devices: 8
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 3
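As a rough illustration of how these settings combine, the sketch below recomputes the total train batch size and evaluates a linear warm-up followed by a half-cosine decay, the usual shape of a `cosine` scheduler with a warm-up ratio. `TOTAL_STEPS = 654` is an assumption inferred from the training log (about 218 steps per epoch over 3 epochs), and rounding the warm-up length up with `ceil` is also an assumption; the function name `cosine_lr` is hypothetical.

```python
import math

PEAK_LR = 1e-4        # learning_rate
WARMUP_RATIO = 0.1    # lr_scheduler_warmup_ratio
TOTAL_STEPS = 654     # assumption: 3 epochs x ~218 steps/epoch, inferred from the log

# Total train batch size = per-device batch x devices x accumulation steps.
effective_batch = 2 * 8 * 2


def cosine_lr(step: int,
              total_steps: int = TOTAL_STEPS,
              peak_lr: float = PEAK_LR,
              warmup_ratio: float = WARMUP_RATIO) -> float:
    """Learning rate at `step`: linear warm-up, then half-cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


print("total train batch size:", effective_batch)
for s in (0, 33, 66, 360, 654):
    print(f"step {s:>3}: lr = {cosine_lr(s):.2e}")
```

Under these assumptions the learning rate peaks at the end of warm-up and falls to zero by the final step.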
Training results
Training Loss | Epoch | Step | Validation Loss | Prm accuracy | Prm precision | Prm recall | Prm specificity | Prm npv | Prm f1 | Prm f1 neg | Prm f1 auc | Prm f1 auc (fixed) |
---|---|---|---|---|---|---|---|---|---|---|---|---|
No log | 0 | 0 | 0.3535 | 0.8333 | 0.8772 | 0.9346 | 0.2632 | 0.4167 | 0.9050 | 0.3226 | 0.5989 | 0.8195 |
0.3784 | 0.0229 | 5 | 0.3555 | 0.8333 | 0.8772 | 0.9346 | 0.2632 | 0.4167 | 0.9050 | 0.3226 | 0.5989 | 0.8175 |
0.406 | 0.0459 | 10 | 0.3487 | 0.8413 | 0.8718 | 0.9533 | 0.2105 | 0.4444 | 0.9107 | 0.2857 | 0.5819 | 0.8224 |
0.2865 | 0.0688 | 15 | 0.3669 | 0.8571 | 0.8678 | 0.9813 | 0.1579 | 0.6 | 0.9211 | 0.25 | 0.5696 | 0.8286 |
0.1769 | 0.0917 | 20 | 0.4144 | 0.8571 | 0.856 | 1.0 | 0.0526 | 1.0 | 0.9224 | 0.1 | 0.5263 | 0.8505 |
0.3069 | 0.1147 | 25 | 0.3531 | 0.8413 | 0.8537 | 0.9813 | 0.0526 | 0.3333 | 0.9130 | 0.0909 | 0.5170 | 0.8546 |
0.2295 | 0.1376 | 30 | 0.3140 | 0.8571 | 0.8739 | 0.9720 | 0.2105 | 0.5714 | 0.9204 | 0.3077 | 0.5912 | 0.8569 |
0.2301 | 0.1606 | 35 | 0.3168 | 0.8651 | 0.875 | 0.9813 | 0.2105 | 0.6667 | 0.9251 | 0.32 | 0.5959 | 0.8706 |
0.351 | 0.1835 | 40 | 0.3087 | 0.8651 | 0.875 | 0.9813 | 0.2105 | 0.6667 | 0.9251 | 0.32 | 0.5959 | 0.8751 |
0.2607 | 0.2064 | 45 | 0.2788 | 0.8730 | 0.8889 | 0.9720 | 0.3158 | 0.6667 | 0.9286 | 0.4286 | 0.6439 | 0.8721 |
0.288 | 0.2294 | 50 | 0.2909 | 0.8651 | 0.8689 | 0.9907 | 0.1579 | 0.75 | 0.9258 | 0.2609 | 0.5743 | 0.8741 |
0.2185 | 0.2523 | 55 | 0.2831 | 0.8571 | 0.8803 | 0.9626 | 0.2632 | 0.5556 | 0.9196 | 0.3571 | 0.6129 | 0.8650 |
0.2307 | 0.2752 | 60 | 0.2987 | 0.8571 | 0.8739 | 0.9720 | 0.2105 | 0.5714 | 0.9204 | 0.3077 | 0.5912 | 0.8667 |
0.1842 | 0.2982 | 65 | 0.3128 | 0.8651 | 0.875 | 0.9813 | 0.2105 | 0.6667 | 0.9251 | 0.32 | 0.5959 | 0.8706 |
0.1831 | 0.3211 | 70 | 0.2936 | 0.8571 | 0.8739 | 0.9720 | 0.2105 | 0.5714 | 0.9204 | 0.3077 | 0.5912 | 0.8667 |
0.1947 | 0.3440 | 75 | 0.3333 | 0.8571 | 0.8739 | 0.9720 | 0.2105 | 0.5714 | 0.9204 | 0.3077 | 0.5912 | 0.8679 |
0.127 | 0.3670 | 80 | 0.2984 | 0.8492 | 0.8793 | 0.9533 | 0.2632 | 0.5 | 0.9148 | 0.3448 | 0.6082 | 0.8763 |
0.1929 | 0.3899 | 85 | 0.3202 | 0.8492 | 0.8729 | 0.9626 | 0.2105 | 0.5 | 0.9156 | 0.2963 | 0.5866 | 0.8785 |
0.2229 | 0.4128 | 90 | 0.2970 | 0.8492 | 0.8793 | 0.9533 | 0.2632 | 0.5 | 0.9148 | 0.3448 | 0.6082 | 0.8805 |
0.2483 | 0.4358 | 95 | 0.2941 | 0.8730 | 0.8760 | 0.9907 | 0.2105 | 0.8 | 0.9298 | 0.3333 | 0.6006 | 0.8758 |
0.1739 | 0.4587 | 100 | 0.2689 | 0.8571 | 0.9083 | 0.9252 | 0.4737 | 0.5294 | 0.9167 | 0.5 | 0.6995 | 0.8741 |
0.1255 | 0.4817 | 105 | 0.2814 | 0.8730 | 0.8760 | 0.9907 | 0.2105 | 0.8 | 0.9298 | 0.3333 | 0.6006 | 0.8792 |
0.2923 | 0.5046 | 110 | 0.2896 | 0.8730 | 0.8760 | 0.9907 | 0.2105 | 0.8 | 0.9298 | 0.3333 | 0.6006 | 0.8765 |
0.1591 | 0.5275 | 115 | 0.2788 | 0.8492 | 0.8793 | 0.9533 | 0.2632 | 0.5 | 0.9148 | 0.3448 | 0.6082 | 0.8674 |
0.1779 | 0.5505 | 120 | 0.3477 | 0.8571 | 0.8678 | 0.9813 | 0.1579 | 0.6 | 0.9211 | 0.25 | 0.5696 | 0.8657 |
0.2207 | 0.5734 | 125 | 0.3152 | 0.8492 | 0.8667 | 0.9720 | 0.1579 | 0.5 | 0.9163 | 0.24 | 0.5649 | 0.8667 |
0.1996 | 0.5963 | 130 | 0.2898 | 0.8651 | 0.875 | 0.9813 | 0.2105 | 0.6667 | 0.9251 | 0.32 | 0.5959 | 0.8647 |
0.1385 | 0.6193 | 135 | 0.2960 | 0.8730 | 0.8760 | 0.9907 | 0.2105 | 0.8 | 0.9298 | 0.3333 | 0.6006 | 0.8684 |
0.1334 | 0.6422 | 140 | 0.3235 | 0.8730 | 0.8760 | 0.9907 | 0.2105 | 0.8 | 0.9298 | 0.3333 | 0.6006 | 0.8694 |
0.2133 | 0.6651 | 145 | 0.3182 | 0.8810 | 0.8898 | 0.9813 | 0.3158 | 0.75 | 0.9333 | 0.4444 | 0.6485 | 0.8765 |
0.1889 | 0.6881 | 150 | 0.3074 | 0.8651 | 0.8879 | 0.9626 | 0.3158 | 0.6 | 0.9238 | 0.4138 | 0.6392 | 0.8751 |
0.1451 | 0.7110 | 155 | 0.3549 | 0.8492 | 0.8667 | 0.9720 | 0.1579 | 0.5 | 0.9163 | 0.24 | 0.5649 | 0.8706 |
0.205 | 0.7339 | 160 | 0.3393 | 0.8571 | 0.8739 | 0.9720 | 0.2105 | 0.5714 | 0.9204 | 0.3077 | 0.5912 | 0.8736 |
0.1852 | 0.7569 | 165 | 0.2970 | 0.8651 | 0.8879 | 0.9626 | 0.3158 | 0.6 | 0.9238 | 0.4138 | 0.6392 | 0.8768 |
0.1195 | 0.7798 | 170 | 0.3259 | 0.8730 | 0.8824 | 0.9813 | 0.2632 | 0.7143 | 0.9292 | 0.3846 | 0.6222 | 0.8785 |
0.2319 | 0.8028 | 175 | 0.3726 | 0.8730 | 0.8760 | 0.9907 | 0.2105 | 0.8 | 0.9298 | 0.3333 | 0.6006 | 0.8788 |
0.1131 | 0.8257 | 180 | 0.2880 | 0.8730 | 0.8760 | 0.9907 | 0.2105 | 0.8 | 0.9298 | 0.3333 | 0.6006 | 0.8795 |
0.2147 | 0.8486 | 185 | 0.2731 | 0.8889 | 0.8908 | 0.9907 | 0.3158 | 0.8571 | 0.9381 | 0.4615 | 0.6532 | 0.8802 |
0.1673 | 0.8716 | 190 | 0.3121 | 0.8651 | 0.8689 | 0.9907 | 0.1579 | 0.75 | 0.9258 | 0.2609 | 0.5743 | 0.8822 |
0.0806 | 0.8945 | 195 | 0.3096 | 0.8810 | 0.8833 | 0.9907 | 0.2632 | 0.8333 | 0.9339 | 0.4 | 0.6269 | 0.8815 |
0.1573 | 0.9174 | 200 | 0.2795 | 0.8730 | 0.8957 | 0.9626 | 0.3684 | 0.6364 | 0.9279 | 0.4667 | 0.6655 | 0.8812 |
0.1062 | 0.9404 | 205 | 0.3145 | 0.8810 | 0.8833 | 0.9907 | 0.2632 | 0.8333 | 0.9339 | 0.4 | 0.6269 | 0.8760 |
0.1036 | 0.9633 | 210 | 0.3353 | 0.8810 | 0.8833 | 0.9907 | 0.2632 | 0.8333 | 0.9339 | 0.4 | 0.6269 | 0.8829 |
0.1586 | 0.9862 | 215 | 0.3197 | 0.8810 | 0.8833 | 0.9907 | 0.2632 | 0.8333 | 0.9339 | 0.4 | 0.6269 | 0.8842 |
0.1319 | 1.0092 | 220 | 0.2869 | 0.8810 | 0.8833 | 0.9907 | 0.2632 | 0.8333 | 0.9339 | 0.4 | 0.6269 | 0.8790 |
0.1444 | 1.0321 | 225 | 0.3127 | 0.8730 | 0.8824 | 0.9813 | 0.2632 | 0.7143 | 0.9292 | 0.3846 | 0.6222 | 0.8812 |
0.1346 | 1.0550 | 230 | 0.4127 | 0.8651 | 0.8689 | 0.9907 | 0.1579 | 0.75 | 0.9258 | 0.2609 | 0.5743 | 0.8824 |
0.1412 | 1.0780 | 235 | 0.3096 | 0.8730 | 0.8957 | 0.9626 | 0.3684 | 0.6364 | 0.9279 | 0.4667 | 0.6655 | 0.8847 |
0.0782 | 1.1009 | 240 | 0.3728 | 0.8730 | 0.8760 | 0.9907 | 0.2105 | 0.8 | 0.9298 | 0.3333 | 0.6006 | 0.8775 |
0.1134 | 1.1239 | 245 | 0.4411 | 0.8651 | 0.8689 | 0.9907 | 0.1579 | 0.75 | 0.9258 | 0.2609 | 0.5743 | 0.8709 |
0.0536 | 1.1468 | 250 | 0.3540 | 0.8730 | 0.8760 | 0.9907 | 0.2105 | 0.8 | 0.9298 | 0.3333 | 0.6006 | 0.8733 |
0.0687 | 1.1697 | 255 | 0.3990 | 0.8651 | 0.8689 | 0.9907 | 0.1579 | 0.75 | 0.9258 | 0.2609 | 0.5743 | 0.8733 |
0.0868 | 1.1927 | 260 | 0.4859 | 0.8651 | 0.8689 | 0.9907 | 0.1579 | 0.75 | 0.9258 | 0.2609 | 0.5743 | 0.8684 |
0.0858 | 1.2156 | 265 | 0.4038 | 0.8651 | 0.8689 | 0.9907 | 0.1579 | 0.75 | 0.9258 | 0.2609 | 0.5743 | 0.8679 |
0.1594 | 1.2385 | 270 | 0.3725 | 0.8730 | 0.8760 | 0.9907 | 0.2105 | 0.8 | 0.9298 | 0.3333 | 0.6006 | 0.8603 |
0.0289 | 1.2615 | 275 | 0.4295 | 0.8651 | 0.8689 | 0.9907 | 0.1579 | 0.75 | 0.9258 | 0.2609 | 0.5743 | 0.8544 |
0.121 | 1.2844 | 280 | 0.4086 | 0.8651 | 0.8689 | 0.9907 | 0.1579 | 0.75 | 0.9258 | 0.2609 | 0.5743 | 0.8556 |
0.133 | 1.3073 | 285 | 0.3791 | 0.8571 | 0.8739 | 0.9720 | 0.2105 | 0.5714 | 0.9204 | 0.3077 | 0.5912 | 0.8620 |
0.0599 | 1.3303 | 290 | 0.3711 | 0.8492 | 0.8729 | 0.9626 | 0.2105 | 0.5 | 0.9156 | 0.2963 | 0.5866 | 0.8596 |
0.0826 | 1.3532 | 295 | 0.4366 | 0.8651 | 0.8689 | 0.9907 | 0.1579 | 0.75 | 0.9258 | 0.2609 | 0.5743 | 0.8571 |
0.0724 | 1.3761 | 300 | 0.4015 | 0.8730 | 0.8760 | 0.9907 | 0.2105 | 0.8 | 0.9298 | 0.3333 | 0.6006 | 0.8529 |
0.1192 | 1.3991 | 305 | 0.3474 | 0.8571 | 0.8739 | 0.9720 | 0.2105 | 0.5714 | 0.9204 | 0.3077 | 0.5912 | 0.8608 |
0.1415 | 1.4220 | 310 | 0.3613 | 0.8571 | 0.8739 | 0.9720 | 0.2105 | 0.5714 | 0.9204 | 0.3077 | 0.5912 | 0.8650 |
0.1567 | 1.4450 | 315 | 0.4130 | 0.8571 | 0.8678 | 0.9813 | 0.1579 | 0.6 | 0.9211 | 0.25 | 0.5696 | 0.8645 |
0.0669 | 1.4679 | 320 | 0.4484 | 0.8571 | 0.8678 | 0.9813 | 0.1579 | 0.6 | 0.9211 | 0.25 | 0.5696 | 0.8660 |
0.0824 | 1.4908 | 325 | 0.4422 | 0.8651 | 0.875 | 0.9813 | 0.2105 | 0.6667 | 0.9251 | 0.32 | 0.5959 | 0.8687 |
0.0809 | 1.5138 | 330 | 0.4073 | 0.8571 | 0.8739 | 0.9720 | 0.2105 | 0.5714 | 0.9204 | 0.3077 | 0.5912 | 0.8756 |
0.1214 | 1.5367 | 335 | 0.4339 | 0.8492 | 0.8667 | 0.9720 | 0.1579 | 0.5 | 0.9163 | 0.24 | 0.5649 | 0.8679 |
0.0728 | 1.5596 | 340 | 0.4479 | 0.8492 | 0.8667 | 0.9720 | 0.1579 | 0.5 | 0.9163 | 0.24 | 0.5649 | 0.8694 |
0.0982 | 1.5826 | 345 | 0.4658 | 0.8492 | 0.8607 | 0.9813 | 0.1053 | 0.5 | 0.9170 | 0.1739 | 0.5433 | 0.8660 |
0.0643 | 1.6055 | 350 | 0.4538 | 0.8571 | 0.8618 | 0.9907 | 0.1053 | 0.6667 | 0.9217 | 0.1818 | 0.5480 | 0.8637 |
0.0898 | 1.6284 | 355 | 0.3773 | 0.8413 | 0.8595 | 0.9720 | 0.1053 | 0.4 | 0.9123 | 0.1667 | 0.5386 | 0.8660 |
0.1687 | 1.6514 | 360 | 0.3235 | 0.8571 | 0.8803 | 0.9626 | 0.2632 | 0.5556 | 0.9196 | 0.3571 | 0.6129 | 0.8714 |
0.0977 | 1.6743 | 365 | 0.3467 | 0.8651 | 0.8814 | 0.9720 | 0.2632 | 0.625 | 0.9244 | 0.3704 | 0.6176 | 0.8719 |
0.069 | 1.6972 | 370 | 0.4018 | 0.8571 | 0.8618 | 0.9907 | 0.1053 | 0.6667 | 0.9217 | 0.1818 | 0.5480 | 0.8728 |
0.1326 | 1.7202 | 375 | 0.4035 | 0.8571 | 0.8618 | 0.9907 | 0.1053 | 0.6667 | 0.9217 | 0.1818 | 0.5480 | 0.8751 |
0.0738 | 1.7431 | 380 | 0.3621 | 0.8651 | 0.8689 | 0.9907 | 0.1579 | 0.75 | 0.9258 | 0.2609 | 0.5743 | 0.8714 |
0.0988 | 1.7661 | 385 | 0.3494 | 0.8730 | 0.8824 | 0.9813 | 0.2632 | 0.7143 | 0.9292 | 0.3846 | 0.6222 | 0.8719 |
0.0759 | 1.7890 | 390 | 0.3620 | 0.8571 | 0.8803 | 0.9626 | 0.2632 | 0.5556 | 0.9196 | 0.3571 | 0.6129 | 0.8694 |
0.1142 | 1.8119 | 395 | 0.3973 | 0.8571 | 0.8803 | 0.9626 | 0.2632 | 0.5556 | 0.9196 | 0.3571 | 0.6129 | 0.8726 |
0.1812 | 1.8349 | 400 | 0.4123 | 0.8571 | 0.8803 | 0.9626 | 0.2632 | 0.5556 | 0.9196 | 0.3571 | 0.6129 | 0.8719 |
0.0437 | 1.8578 | 405 | 0.3928 | 0.8571 | 0.8870 | 0.9533 | 0.3158 | 0.5455 | 0.9189 | 0.4 | 0.6345 | 0.8704 |
0.1087 | 1.8807 | 410 | 0.4132 | 0.8492 | 0.8729 | 0.9626 | 0.2105 | 0.5 | 0.9156 | 0.2963 | 0.5866 | 0.8699 |
0.094 | 1.9037 | 415 | 0.3848 | 0.8413 | 0.8718 | 0.9533 | 0.2105 | 0.4444 | 0.9107 | 0.2857 | 0.5819 | 0.8699 |
0.082 | 1.9266 | 420 | 0.3774 | 0.8492 | 0.8729 | 0.9626 | 0.2105 | 0.5 | 0.9156 | 0.2963 | 0.5866 | 0.8731 |
0.0896 | 1.9495 | 425 | 0.3949 | 0.8492 | 0.8729 | 0.9626 | 0.2105 | 0.5 | 0.9156 | 0.2963 | 0.5866 | 0.8724 |
0.1096 | 1.9725 | 430 | 0.4198 | 0.8651 | 0.875 | 0.9813 | 0.2105 | 0.6667 | 0.9251 | 0.32 | 0.5959 | 0.8728 |
0.1142 | 1.9954 | 435 | 0.4113 | 0.8651 | 0.875 | 0.9813 | 0.2105 | 0.6667 | 0.9251 | 0.32 | 0.5959 | 0.8736 |
0.0418 | 2.0183 | 440 | 0.4037 | 0.8571 | 0.8739 | 0.9720 | 0.2105 | 0.5714 | 0.9204 | 0.3077 | 0.5912 | 0.8709 |
0.0231 | 2.0413 | 445 | 0.4156 | 0.8571 | 0.8739 | 0.9720 | 0.2105 | 0.5714 | 0.9204 | 0.3077 | 0.5912 | 0.8687 |
0.0357 | 2.0642 | 450 | 0.4368 | 0.8571 | 0.8739 | 0.9720 | 0.2105 | 0.5714 | 0.9204 | 0.3077 | 0.5912 | 0.8689 |
0.0396 | 2.0872 | 455 | 0.4785 | 0.8571 | 0.8739 | 0.9720 | 0.2105 | 0.5714 | 0.9204 | 0.3077 | 0.5912 | 0.8679 |
0.0458 | 2.1101 | 460 | 0.5241 | 0.8571 | 0.8739 | 0.9720 | 0.2105 | 0.5714 | 0.9204 | 0.3077 | 0.5912 | 0.8645 |
0.018 | 2.1330 | 465 | 0.5647 | 0.8571 | 0.8678 | 0.9813 | 0.1579 | 0.6 | 0.9211 | 0.25 | 0.5696 | 0.8628 |
0.0294 | 2.1560 | 470 | 0.6041 | 0.8571 | 0.8678 | 0.9813 | 0.1579 | 0.6 | 0.9211 | 0.25 | 0.5696 | 0.8618 |
0.064 | 2.1789 | 475 | 0.5872 | 0.8413 | 0.8655 | 0.9626 | 0.1579 | 0.4286 | 0.9115 | 0.2308 | 0.5603 | 0.8660 |
0.0708 | 2.2018 | 480 | 0.5229 | 0.8492 | 0.8793 | 0.9533 | 0.2632 | 0.5 | 0.9148 | 0.3448 | 0.6082 | 0.8645 |
0.0344 | 2.2248 | 485 | 0.4986 | 0.8571 | 0.8870 | 0.9533 | 0.3158 | 0.5455 | 0.9189 | 0.4 | 0.6345 | 0.8657 |
0.0184 | 2.2477 | 490 | 0.5377 | 0.8492 | 0.8729 | 0.9626 | 0.2105 | 0.5 | 0.9156 | 0.2963 | 0.5866 | 0.8655 |
0.0316 | 2.2706 | 495 | 0.5832 | 0.8413 | 0.8655 | 0.9626 | 0.1579 | 0.4286 | 0.9115 | 0.2308 | 0.5603 | 0.8628 |
0.0133 | 2.2936 | 500 | 0.5912 | 0.8492 | 0.8667 | 0.9720 | 0.1579 | 0.5 | 0.9163 | 0.24 | 0.5649 | 0.8657 |
0.0315 | 2.3165 | 505 | 0.5803 | 0.8492 | 0.8667 | 0.9720 | 0.1579 | 0.5 | 0.9163 | 0.24 | 0.5649 | 0.8667 |
0.0599 | 2.3394 | 510 | 0.5893 | 0.8413 | 0.8655 | 0.9626 | 0.1579 | 0.4286 | 0.9115 | 0.2308 | 0.5603 | 0.8637 |
0.0245 | 2.3624 | 515 | 0.5885 | 0.8413 | 0.8655 | 0.9626 | 0.1579 | 0.4286 | 0.9115 | 0.2308 | 0.5603 | 0.8660 |
0.0091 | 2.3853 | 520 | 0.5829 | 0.8413 | 0.8655 | 0.9626 | 0.1579 | 0.4286 | 0.9115 | 0.2308 | 0.5603 | 0.8662 |
0.0446 | 2.4083 | 525 | 0.5867 | 0.8413 | 0.8655 | 0.9626 | 0.1579 | 0.4286 | 0.9115 | 0.2308 | 0.5603 | 0.8652 |
0.0185 | 2.4312 | 530 | 0.5704 | 0.8413 | 0.8655 | 0.9626 | 0.1579 | 0.4286 | 0.9115 | 0.2308 | 0.5603 | 0.8652 |
0.02 | 2.4541 | 535 | 0.5542 | 0.8492 | 0.8729 | 0.9626 | 0.2105 | 0.5 | 0.9156 | 0.2963 | 0.5866 | 0.8655 |
0.0248 | 2.4771 | 540 | 0.5494 | 0.8413 | 0.8655 | 0.9626 | 0.1579 | 0.4286 | 0.9115 | 0.2308 | 0.5603 | 0.8665 |
0.0178 | 2.5 | 545 | 0.5424 | 0.8492 | 0.8729 | 0.9626 | 0.2105 | 0.5 | 0.9156 | 0.2963 | 0.5866 | 0.8672 |
0.0084 | 2.5229 | 550 | 0.5434 | 0.8413 | 0.8655 | 0.9626 | 0.1579 | 0.4286 | 0.9115 | 0.2308 | 0.5603 | 0.8650 |
0.0307 | 2.5459 | 555 | 0.5538 | 0.8492 | 0.8729 | 0.9626 | 0.2105 | 0.5 | 0.9156 | 0.2963 | 0.5866 | 0.8655 |
0.0414 | 2.5688 | 560 | 0.5469 | 0.8492 | 0.8729 | 0.9626 | 0.2105 | 0.5 | 0.9156 | 0.2963 | 0.5866 | 0.8652 |
0.0089 | 2.5917 | 565 | 0.5447 | 0.8492 | 0.8729 | 0.9626 | 0.2105 | 0.5 | 0.9156 | 0.2963 | 0.5866 | 0.8645 |
0.027 | 2.6147 | 570 | 0.5398 | 0.8492 | 0.8729 | 0.9626 | 0.2105 | 0.5 | 0.9156 | 0.2963 | 0.5866 | 0.8650 |
0.0087 | 2.6376 | 575 | 0.5417 | 0.8571 | 0.8739 | 0.9720 | 0.2105 | 0.5714 | 0.9204 | 0.3077 | 0.5912 | 0.8660 |
0.0438 | 2.6606 | 580 | 0.5484 | 0.8492 | 0.8729 | 0.9626 | 0.2105 | 0.5 | 0.9156 | 0.2963 | 0.5866 | 0.8642 |
0.0238 | 2.6835 | 585 | 0.5475 | 0.8492 | 0.8729 | 0.9626 | 0.2105 | 0.5 | 0.9156 | 0.2963 | 0.5866 | 0.8657 |
0.0515 | 2.7064 | 590 | 0.5416 | 0.8492 | 0.8729 | 0.9626 | 0.2105 | 0.5 | 0.9156 | 0.2963 | 0.5866 | 0.8635 |
0.1527 | 2.7294 | 595 | 0.5277 | 0.8571 | 0.8739 | 0.9720 | 0.2105 | 0.5714 | 0.9204 | 0.3077 | 0.5912 | 0.8650 |
0.0587 | 2.7523 | 600 | 0.5274 | 0.8571 | 0.8739 | 0.9720 | 0.2105 | 0.5714 | 0.9204 | 0.3077 | 0.5912 | 0.8660 |
0.0062 | 2.7752 | 605 | 0.5361 | 0.8492 | 0.8729 | 0.9626 | 0.2105 | 0.5 | 0.9156 | 0.2963 | 0.5866 | 0.8655 |
0.0279 | 2.7982 | 610 | 0.5427 | 0.8492 | 0.8729 | 0.9626 | 0.2105 | 0.5 | 0.9156 | 0.2963 | 0.5866 | 0.8637 |
0.0046 | 2.8211 | 615 | 0.5462 | 0.8413 | 0.8655 | 0.9626 | 0.1579 | 0.4286 | 0.9115 | 0.2308 | 0.5603 | 0.8635 |
0.0156 | 2.8440 | 620 | 0.5533 | 0.8413 | 0.8655 | 0.9626 | 0.1579 | 0.4286 | 0.9115 | 0.2308 | 0.5603 | 0.8625 |
0.0077 | 2.8670 | 625 | 0.5456 | 0.8492 | 0.8729 | 0.9626 | 0.2105 | 0.5 | 0.9156 | 0.2963 | 0.5866 | 0.8650 |
0.0085 | 2.8899 | 630 | 0.5465 | 0.8413 | 0.8655 | 0.9626 | 0.1579 | 0.4286 | 0.9115 | 0.2308 | 0.5603 | 0.8637 |
0.0179 | 2.9128 | 635 | 0.5492 | 0.8413 | 0.8655 | 0.9626 | 0.1579 | 0.4286 | 0.9115 | 0.2308 | 0.5603 | 0.8630 |
0.0571 | 2.9358 | 640 | 0.5421 | 0.8571 | 0.8739 | 0.9720 | 0.2105 | 0.5714 | 0.9204 | 0.3077 | 0.5912 | 0.8640 |
0.0239 | 2.9587 | 645 | 0.5471 | 0.8413 | 0.8655 | 0.9626 | 0.1579 | 0.4286 | 0.9115 | 0.2308 | 0.5603 | 0.8665 |
0.0317 | 2.9817 | 650 | 0.5531 | 0.8413 | 0.8655 | 0.9626 | 0.1579 | 0.4286 | 0.9115 | 0.2308 | 0.5603 | 0.8635 |
Framework versions
- PEFT 0.12.0
- Transformers 4.46.0
- Pytorch 2.4.0+cu118
- Datasets 3.0.0
- Tokenizers 0.20.1
Model tree for Daewon0808/mmlu_small_noaugs_llama_lora
- Base model: meta-llama/Llama-3.1-8B
- Finetuned: meta-llama/Llama-3.1-8B-Instruct
- Finetuned: Daewon0808/prm800k_llama_fulltune