# fineweb-edu-scorer-mdeberta-binary
This model is a fine-tuned version of [microsoft/mdeberta-v3-base](https://huggingface.co/microsoft/mdeberta-v3-base) on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 0.0644
- Precision: 0.8738
- Recall: 0.5228
- F1 Macro: 0.5213
- Accuracy: 0.9132
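The combination of high accuracy (0.91) with macro recall and macro F1 near 0.52 is the signature of an imbalanced evaluation set in which the minority class is rarely predicted. A minimal sketch (with made-up, hypothetical counts, not the actual evaluation data) of how macro-averaged metrics produce this pattern:

```python
# Hypothetical binary confusion counts chosen to mimic an imbalanced
# set where the model almost always predicts the majority (negative) class:
tn, fp, fn, tp = 900, 2, 95, 3  # true neg, false pos, false neg, true pos

def prf(tp, fp, fn):
    """Precision, recall, and F1 for one class from its confusion counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Per-class metrics: the negative class plays the "positive" role
# with fp and fn swapped.
p_pos, r_pos, f1_pos = prf(tp, fp, fn)
p_neg, r_neg, f1_neg = prf(tn, fn, fp)

# Macro averaging weights both classes equally, so a rare, poorly
# recalled positive class drags macro recall toward 0.5 even though
# overall accuracy stays high.
macro_recall = (r_pos + r_neg) / 2
macro_f1 = (f1_pos + f1_neg) / 2
accuracy = (tp + tn) / (tp + tn + fp + fn)
```

With these toy counts, accuracy is 0.903 while macro recall is about 0.51, mirroring the table above.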
## Model description
More information needed
## Intended uses & limitations
More information needed
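As a sketch of how the checkpoint could be loaded for binary scoring with the standard Transformers API. The card does not document what the two class indices mean, so treat the returned labels as opaque indices rather than "educational / not educational":

```python
def score_texts(texts, model_name="whoisjones/fineweb-edu-scorer-mdeberta-binary"):
    """Return the predicted class index (0 or 1) for each input text.

    Lazy imports keep the sketch importable without transformers installed.
    """
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    model.eval()

    with torch.no_grad():
        batch = tokenizer(texts, truncation=True, padding=True, return_tensors="pt")
        logits = model(**batch).logits
    return logits.argmax(dim=-1).tolist()
```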
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 128
- eval_batch_size: 256
- seed: 0
- optimizer: AdamW (torch) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 20
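With a linear scheduler and no warmup listed, the learning rate decays from 5e-05 toward zero over the full run. Judging from the results table (step 65,000 lands near epoch 19.77, i.e. roughly 3,287 optimizer steps per epoch and ~65,700 total steps, an estimate rather than a value stated in this card), the schedule can be sketched as:

```python
BASE_LR = 5e-5
TOTAL_STEPS = 65_700  # estimated from the results table, not stated in the card

def linear_lr(step, total_steps=TOTAL_STEPS, base_lr=BASE_LR):
    """Linear decay with no warmup: base_lr at step 0, zero at the final step."""
    return base_lr * max(0.0, 1.0 - step / total_steps)
```

For example, halfway through training the learning rate would sit at 2.5e-05.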
### Training results
Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 Macro | Accuracy |
---|---|---|---|---|---|---|---|
No log | 0 | 0 | 0.1176 | 0.4549 | 0.5 | 0.4764 | 0.9098 |
0.0765 | 0.3042 | 1000 | 0.0758 | 0.4549 | 0.5 | 0.4764 | 0.9098 |
0.0715 | 0.6085 | 2000 | 0.0723 | 0.4549 | 0.5 | 0.4764 | 0.9098 |
0.0703 | 0.9127 | 3000 | 0.0718 | 0.6216 | 0.5003 | 0.4771 | 0.9098 |
0.0696 | 1.2169 | 4000 | 0.0714 | 0.5799 | 0.5001 | 0.4766 | 0.9098 |
0.0703 | 1.5211 | 5000 | 0.0693 | 0.6633 | 0.5005 | 0.4776 | 0.9098 |
0.07 | 1.8254 | 6000 | 0.0688 | 0.7407 | 0.5004 | 0.4773 | 0.9099 |
0.0686 | 2.1296 | 7000 | 0.0694 | 0.8123 | 0.5023 | 0.4812 | 0.9101 |
0.0703 | 2.4338 | 8000 | 0.0733 | 0.8295 | 0.5172 | 0.5109 | 0.9120 |
0.0687 | 2.7381 | 9000 | 0.0681 | 0.8479 | 0.5013 | 0.4790 | 0.9100 |
0.0671 | 3.0423 | 10000 | 0.0678 | 0.9021 | 0.5049 | 0.4864 | 0.9106 |
0.0684 | 3.3465 | 11000 | 0.0674 | 0.8906 | 0.5055 | 0.4876 | 0.9107 |
0.069 | 3.6507 | 12000 | 0.0676 | 0.8979 | 0.5081 | 0.4928 | 0.9111 |
0.0681 | 3.9550 | 13000 | 0.0679 | 0.8975 | 0.5027 | 0.4819 | 0.9103 |
0.0713 | 4.2592 | 14000 | 0.0673 | 0.8769 | 0.5112 | 0.4990 | 0.9115 |
0.0695 | 4.5634 | 15000 | 0.0670 | 0.9100 | 0.5059 | 0.4883 | 0.9108 |
0.0684 | 4.8677 | 16000 | 0.0672 | 0.9100 | 0.5059 | 0.4883 | 0.9108 |
0.0657 | 5.1719 | 17000 | 0.0668 | 0.8835 | 0.5131 | 0.5027 | 0.9118 |
0.0672 | 5.4761 | 18000 | 0.0664 | 0.9108 | 0.5095 | 0.4956 | 0.9114 |
0.0681 | 5.7803 | 19000 | 0.0683 | 0.9026 | 0.5040 | 0.4846 | 0.9105 |
0.066 | 6.0846 | 20000 | 0.0667 | 0.8911 | 0.5164 | 0.5090 | 0.9124 |
0.0669 | 6.3888 | 21000 | 0.0664 | 0.9133 | 0.5076 | 0.4919 | 0.9111 |
0.0678 | 6.6930 | 22000 | 0.0662 | 0.8915 | 0.5150 | 0.5063 | 0.9122 |
0.0679 | 6.9973 | 23000 | 0.0664 | 0.8685 | 0.5201 | 0.5162 | 0.9128 |
0.0659 | 7.3015 | 24000 | 0.0658 | 0.8981 | 0.5125 | 0.5016 | 0.9118 |
0.067 | 7.6057 | 25000 | 0.0658 | 0.9018 | 0.5116 | 0.4997 | 0.9117 |
0.0654 | 7.9099 | 26000 | 0.0658 | 0.9058 | 0.5106 | 0.4977 | 0.9116 |
0.0651 | 8.2142 | 27000 | 0.0659 | 0.8758 | 0.5176 | 0.5114 | 0.9125 |
0.067 | 8.5184 | 28000 | 0.0664 | 0.9107 | 0.5083 | 0.4933 | 0.9112 |
0.0651 | 8.8226 | 29000 | 0.0661 | 0.8791 | 0.5216 | 0.5189 | 0.9131 |
0.0675 | 9.1269 | 30000 | 0.0654 | 0.8900 | 0.5153 | 0.5070 | 0.9122 |
0.0683 | 9.4311 | 31000 | 0.0665 | 0.9097 | 0.5093 | 0.4951 | 0.9114 |
0.0651 | 9.7353 | 32000 | 0.0658 | 0.9053 | 0.5104 | 0.4975 | 0.9115 |
0.0706 | 10.0395 | 33000 | 0.0655 | 0.9104 | 0.5117 | 0.5000 | 0.9118 |
0.0661 | 10.3438 | 34000 | 0.0654 | 0.8783 | 0.5219 | 0.5196 | 0.9131 |
0.0655 | 10.6480 | 35000 | 0.0667 | 0.9045 | 0.5093 | 0.4951 | 0.9113 |
0.0657 | 10.9522 | 36000 | 0.0655 | 0.9105 | 0.5129 | 0.5023 | 0.9120 |
0.0667 | 11.2565 | 37000 | 0.0651 | 0.9033 | 0.5149 | 0.5061 | 0.9122 |
0.0658 | 11.5607 | 38000 | 0.0651 | 0.8713 | 0.5225 | 0.5208 | 0.9131 |
0.0641 | 11.8649 | 39000 | 0.0670 | 0.9050 | 0.5083 | 0.4933 | 0.9112 |
0.0671 | 12.1692 | 40000 | 0.0650 | 0.8917 | 0.5158 | 0.5079 | 0.9123 |
0.0645 | 12.4734 | 41000 | 0.0650 | 0.8749 | 0.5226 | 0.5208 | 0.9132 |
0.0671 | 12.7776 | 42000 | 0.0655 | 0.9095 | 0.5137 | 0.5039 | 0.9121 |
0.0667 | 13.0818 | 43000 | 0.0660 | 0.8478 | 0.5371 | 0.5472 | 0.9148 |
0.0647 | 13.3861 | 44000 | 0.0657 | 0.8644 | 0.5314 | 0.5371 | 0.9143 |
0.0634 | 13.6903 | 45000 | 0.0651 | 0.9086 | 0.5156 | 0.5075 | 0.9124 |
0.0676 | 13.9945 | 46000 | 0.0648 | 0.8749 | 0.5226 | 0.5208 | 0.9132 |
0.0671 | 14.2988 | 47000 | 0.0652 | 0.8641 | 0.5293 | 0.5333 | 0.9140 |
0.0656 | 14.6030 | 48000 | 0.0647 | 0.8753 | 0.5204 | 0.5167 | 0.9129 |
0.0652 | 14.9072 | 49000 | 0.0646 | 0.8740 | 0.5200 | 0.5160 | 0.9128 |
0.0646 | 15.2114 | 50000 | 0.0646 | 0.8732 | 0.5198 | 0.5156 | 0.9128 |
0.0632 | 15.5157 | 51000 | 0.0646 | 0.8763 | 0.5225 | 0.5206 | 0.9132 |
0.0626 | 15.8199 | 52000 | 0.0650 | 0.9055 | 0.5156 | 0.5075 | 0.9124 |
0.0643 | 16.1241 | 53000 | 0.0650 | 0.9072 | 0.5162 | 0.5086 | 0.9125 |
0.0651 | 16.4284 | 54000 | 0.0646 | 0.8730 | 0.5248 | 0.5249 | 0.9135 |
0.0669 | 16.7326 | 55000 | 0.0645 | 0.8749 | 0.5220 | 0.5198 | 0.9131 |
0.0679 | 17.0368 | 56000 | 0.0647 | 0.8759 | 0.5276 | 0.5301 | 0.9139 |
0.0665 | 17.3410 | 57000 | 0.0663 | 0.9113 | 0.5120 | 0.5005 | 0.9118 |
0.066 | 17.6453 | 58000 | 0.0645 | 0.8785 | 0.5202 | 0.5163 | 0.9129 |
0.0652 | 17.9495 | 59000 | 0.0645 | 0.8750 | 0.5261 | 0.5273 | 0.9137 |
0.0659 | 18.2537 | 60000 | 0.0647 | 0.8839 | 0.5185 | 0.5132 | 0.9127 |
0.0641 | 18.5580 | 61000 | 0.0645 | 0.8729 | 0.5208 | 0.5176 | 0.9129 |
0.0635 | 18.8622 | 62000 | 0.0645 | 0.8729 | 0.5208 | 0.5176 | 0.9129 |
0.0623 | 19.1664 | 63000 | 0.0645 | 0.8731 | 0.5226 | 0.5208 | 0.9132 |
0.0633 | 19.4706 | 64000 | 0.0644 | 0.8756 | 0.5234 | 0.5224 | 0.9133 |
0.0678 | 19.7749 | 65000 | 0.0644 | 0.8738 | 0.5228 | 0.5213 | 0.9132 |
## Framework versions
- Transformers 4.49.0
- PyTorch 2.6.0+cu124
- Datasets 3.3.2
- Tokenizers 0.21.1