focal_modernbert_punctuation_128

This model is a fine-tuned version of answerdotai/ModernBERT-large on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0194
  • Accuracy: 0.9786
  • Precision O: 0.9898
  • Recall O: 0.9918
  • F1 O: 0.9908
  • Precision Comma: 0.8413
  • Recall Comma: 0.8165
  • F1 Comma: 0.8288
  • Precision Period: 0.9055
  • Recall Period: 0.8971
  • F1 Period: 0.9013
  • Precision Question: 0.8218
  • Recall Question: 0.8171
  • F1 Question: 0.8195
  • Precision Exclamation: 0.0
  • Recall Exclamation: 0.0
  • F1 Exclamation: 0.0
  • Precision Macro: 0.8896
  • Recall Macro: 0.8807
  • F1 Macro: 0.8851
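
Note that the Exclamation class scored zero across the board, and the macro averages appear to be computed over the remaining four classes only. A minimal sketch (not taken from the evaluation code) that reproduces the reported macro figures from the per-class values above:

```python
# Reproduce the macro metrics from the per-class (precision, recall) pairs.
# Exclamation (0.0 everywhere) is excluded; averaging the other four classes
# matches the reported numbers.
per_class = {
    "O":        (0.9898, 0.9918),
    "Comma":    (0.8413, 0.8165),
    "Period":   (0.9055, 0.8971),
    "Question": (0.8218, 0.8171),
}

def f1(p, r):
    return 2 * p * r / (p + r)

precision_macro = sum(p for p, _ in per_class.values()) / len(per_class)
recall_macro = sum(r for _, r in per_class.values()) / len(per_class)
f1_macro = sum(f1(p, r) for p, r in per_class.values()) / len(per_class)

print(round(precision_macro, 4))  # 0.8896 -- matches Precision Macro
print(round(recall_macro, 4))     # 0.8806 ~= reported 0.8807 (inputs are rounded)
print(round(f1_macro, 4))         # 0.8851 -- matches F1 Macro
```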

Model description

More information needed

Intended uses & limitations

More information needed
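
Since this section is empty, here is a hedged inference sketch for punctuation restoration. It assumes the checkpoint is a standard token-classification model whose label names match the classes in the metrics above (O, Comma, Period, Question, Exclamation); verify the actual names in model.config.id2label before relying on the mapping below. The reconstruction is naive, and the aggregation strategy merges adjacent words with the same predicted class.

```python
# Hedged sketch, not an official usage example. The PUNCT mapping and the
# label names are assumptions; check model.config.id2label for this checkpoint.
from transformers import pipeline

restorer = pipeline(
    "token-classification",
    model="whooray/focal_modernbert_punctuation_128",
    aggregation_strategy="simple",  # merge subword pieces into word-level spans
)

# Hypothetical mapping from predicted class to the character to insert.
PUNCT = {"Comma": ",", "Period": ".", "Question": "?", "Exclamation": "!"}

text = "hello how are you today i am fine thanks"
pieces, last = [], 0
# The pipeline drops "O" predictions by default (ignore_labels=["O"]), so only
# punctuation-bearing spans come back; append the mark after each span's end.
for ent in restorer(text):
    pieces.append(text[last:ent["end"]])
    pieces.append(PUNCT.get(ent["entity_group"], ""))
    last = ent["end"]
pieces.append(text[last:])
print("".join(pieces))
```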

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 128
  • eval_batch_size: 16
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 8
  • total_train_batch_size: 1024
  • total_eval_batch_size: 128
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 3
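
As a rough guide, the sketch below shows how these settings map onto transformers.TrainingArguments. It is not the actual training script: the dataset, the token-classification preprocessing, and the loss (the model name suggests a focal loss) are undocumented, and the 8-GPU launch that yields the total batch sizes above is handled by the launcher, not by these arguments.

```python
# Sketch only: the hyperparameters above expressed as TrainingArguments.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="focal_modernbert_punctuation_128",
    learning_rate=5e-5,
    per_device_train_batch_size=128,  # x 8 GPUs = total train batch size 1024
    per_device_eval_batch_size=16,    # x 8 GPUs = total eval batch size 128
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_steps=100,
    num_train_epochs=3,
)
```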

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | Precision O | Recall O | F1 O | Precision Comma | Recall Comma | F1 Comma | Precision Period | Recall Period | F1 Period | Precision Question | Recall Question | F1 Question | Precision Exclamation | Recall Exclamation | F1 Exclamation | Precision Macro | Recall Macro | F1 Macro |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0.0716 | 0.0423 | 100 | 0.0668 | 0.9402 | 0.9607 | 0.9858 | 0.9731 | 0.5560 | 0.3841 | 0.4544 | 0.7854 | 0.6756 | 0.7264 | 0.7 | 0.04 | 0.0757 | 0.0 | 0.0 | 0.0 | 0.7505 | 0.5214 | 0.5574 |
| 0.0313 | 0.0846 | 200 | 0.0339 | 0.9652 | 0.9805 | 0.9891 | 0.9848 | 0.7709 | 0.6406 | 0.6997 | 0.8394 | 0.8548 | 0.8470 | 0.7697 | 0.6971 | 0.7316 | 0.0 | 0.0 | 0.0 | 0.8401 | 0.7954 | 0.8158 |
| 0.0261 | 0.1269 | 300 | 0.0292 | 0.9697 | 0.9848 | 0.9890 | 0.9869 | 0.7918 | 0.7127 | 0.7502 | 0.8486 | 0.8788 | 0.8634 | 0.7940 | 0.6829 | 0.7343 | 0.0 | 0.0 | 0.0 | 0.8548 | 0.8158 | 0.8337 |
| 0.027 | 0.1692 | 400 | 0.0288 | 0.9686 | 0.9842 | 0.9881 | 0.9861 | 0.7582 | 0.7493 | 0.7537 | 0.8842 | 0.8241 | 0.8531 | 0.7798 | 0.7486 | 0.7638 | 0.0 | 0.0 | 0.0 | 0.8516 | 0.8275 | 0.8392 |
| 0.0234 | 0.2115 | 500 | 0.0260 | 0.9718 | 0.9842 | 0.9912 | 0.9877 | 0.8064 | 0.7275 | 0.7649 | 0.8891 | 0.8593 | 0.8740 | 0.7923 | 0.7629 | 0.7773 | 0.0 | 0.0 | 0.0 | 0.8680 | 0.8352 | 0.8510 |
| 0.0221 | 0.2538 | 600 | 0.0246 | 0.9730 | 0.9835 | 0.9928 | 0.9881 | 0.8334 | 0.7188 | 0.7719 | 0.8922 | 0.8659 | 0.8789 | 0.8389 | 0.7143 | 0.7716 | 0.0 | 0.0 | 0.0 | 0.8870 | 0.8229 | 0.8526 |
| 0.021 | 0.2961 | 700 | 0.0235 | 0.9744 | 0.9876 | 0.9904 | 0.9890 | 0.8002 | 0.7894 | 0.7948 | 0.8997 | 0.8629 | 0.8809 | 0.8090 | 0.7743 | 0.7912 | 0.0 | 0.0 | 0.0 | 0.8741 | 0.8542 | 0.8640 |
| 0.0206 | 0.3384 | 800 | 0.0230 | 0.9744 | 0.9886 | 0.9891 | 0.9889 | 0.7990 | 0.7977 | 0.7984 | 0.8892 | 0.8802 | 0.8847 | 0.7452 | 0.7686 | 0.7567 | 0.0 | 0.0 | 0.0 | 0.8555 | 0.8589 | 0.8572 |
| 0.0197 | 0.3807 | 900 | 0.0224 | 0.9749 | 0.9876 | 0.9908 | 0.9892 | 0.8155 | 0.7815 | 0.7982 | 0.8919 | 0.8784 | 0.8851 | 0.8050 | 0.7314 | 0.7665 | 0.0 | 0.0 | 0.0 | 0.8750 | 0.8456 | 0.8597 |
| 0.0193 | 0.4230 | 1000 | 0.0221 | 0.9753 | 0.9886 | 0.9902 | 0.9894 | 0.8143 | 0.7862 | 0.8 | 0.8854 | 0.8937 | 0.8895 | 0.7970 | 0.7629 | 0.7796 | 0.0 | 0.0 | 0.0 | 0.8713 | 0.8582 | 0.8646 |
| 0.0193 | 0.4653 | 1100 | 0.0217 | 0.9756 | 0.9876 | 0.9913 | 0.9895 | 0.8160 | 0.7889 | 0.8022 | 0.9056 | 0.8729 | 0.8890 | 0.8116 | 0.7629 | 0.7865 | 0.0 | 0.0 | 0.0 | 0.8802 | 0.8540 | 0.8668 |
| 0.0194 | 0.5076 | 1200 | 0.0215 | 0.9757 | 0.9891 | 0.9901 | 0.9896 | 0.8160 | 0.7939 | 0.8048 | 0.8786 | 0.9002 | 0.8892 | 0.9046 | 0.6771 | 0.7745 | 0.0 | 0.0 | 0.0 | 0.8971 | 0.8403 | 0.8645 |
| 0.0187 | 0.5499 | 1300 | 0.0213 | 0.9761 | 0.9880 | 0.9913 | 0.9897 | 0.8294 | 0.7835 | 0.8058 | 0.8915 | 0.8916 | 0.8915 | 0.8339 | 0.7457 | 0.7873 | 0.0 | 0.0 | 0.0 | 0.8857 | 0.8530 | 0.8686 |
| 0.0186 | 0.5922 | 1400 | 0.0211 | 0.9758 | 0.9898 | 0.9894 | 0.9896 | 0.8009 | 0.8152 | 0.8080 | 0.8953 | 0.8880 | 0.8916 | 0.8224 | 0.7543 | 0.7869 | 0.0 | 0.0 | 0.0 | 0.8771 | 0.8617 | 0.8690 |
| 0.0182 | 0.6345 | 1500 | 0.0209 | 0.9762 | 0.9896 | 0.9901 | 0.9899 | 0.8028 | 0.8209 | 0.8117 | 0.9033 | 0.8777 | 0.8903 | 0.8475 | 0.7143 | 0.7752 | 0.0 | 0.0 | 0.0 | 0.8858 | 0.8507 | 0.8668 |
| 0.0182 | 0.6768 | 1600 | 0.0208 | 0.9763 | 0.9862 | 0.9932 | 0.9897 | 0.8497 | 0.7598 | 0.8022 | 0.9050 | 0.8830 | 0.8939 | 0.8065 | 0.7743 | 0.7901 | 0.0 | 0.0 | 0.0 | 0.8869 | 0.8526 | 0.8690 |
| 0.0177 | 0.7191 | 1700 | 0.0203 | 0.9766 | 0.9877 | 0.9922 | 0.9900 | 0.8478 | 0.7657 | 0.8047 | 0.8862 | 0.9011 | 0.8936 | 0.8 | 0.8229 | 0.8113 | 0.0 | 0.0 | 0.0 | 0.8804 | 0.8705 | 0.8749 |
| 0.0178 | 0.7614 | 1800 | 0.0202 | 0.9764 | 0.9889 | 0.9908 | 0.9899 | 0.8315 | 0.7896 | 0.8100 | 0.8835 | 0.9005 | 0.8919 | 0.8195 | 0.7914 | 0.8052 | 0.0 | 0.0 | 0.0 | 0.8809 | 0.8681 | 0.8742 |
| 0.0178 | 0.8037 | 1900 | 0.0202 | 0.9768 | 0.9877 | 0.9923 | 0.9900 | 0.8324 | 0.7941 | 0.8128 | 0.9101 | 0.8727 | 0.8910 | 0.8428 | 0.7657 | 0.8024 | 0.0 | 0.0 | 0.0 | 0.8932 | 0.8562 | 0.8741 |
| 0.0174 | 0.8460 | 2000 | 0.0206 | 0.9761 | 0.9858 | 0.9937 | 0.9898 | 0.8430 | 0.7642 | 0.8017 | 0.9159 | 0.8601 | 0.8871 | 0.7898 | 0.7943 | 0.7920 | 0.0 | 0.0 | 0.0 | 0.8836 | 0.8531 | 0.8676 |
| 0.0177 | 0.8883 | 2100 | 0.0199 | 0.9773 | 0.9892 | 0.9912 | 0.9902 | 0.8251 | 0.8109 | 0.8179 | 0.9032 | 0.8899 | 0.8965 | 0.8742 | 0.7543 | 0.8098 | 0.0 | 0.0 | 0.0 | 0.8979 | 0.8616 | 0.8786 |
| 0.0172 | 0.9306 | 2200 | 0.0197 | 0.9772 | 0.9900 | 0.9906 | 0.9903 | 0.8247 | 0.8136 | 0.8191 | 0.8948 | 0.8964 | 0.8956 | 0.7716 | 0.7914 | 0.7814 | 0.0 | 0.0 | 0.0 | 0.8703 | 0.8730 | 0.8716 |
| 0.0175 | 0.9729 | 2300 | 0.0196 | 0.9772 | 0.9882 | 0.9921 | 0.9902 | 0.8463 | 0.7843 | 0.8141 | 0.8949 | 0.8987 | 0.8968 | 0.7923 | 0.7629 | 0.7773 | 0.0 | 0.0 | 0.0 | 0.8804 | 0.8595 | 0.8696 |
| 0.0155 | 1.0152 | 2400 | 0.0195 | 0.9772 | 0.9893 | 0.9912 | 0.9902 | 0.8317 | 0.8026 | 0.8169 | 0.8945 | 0.8951 | 0.8948 | 0.7988 | 0.7829 | 0.7908 | 0.0 | 0.0 | 0.0 | 0.8786 | 0.8679 | 0.8732 |
| 0.0151 | 1.0575 | 2500 | 0.0196 | 0.9776 | 0.9885 | 0.9923 | 0.9904 | 0.8346 | 0.8031 | 0.8185 | 0.9120 | 0.8824 | 0.8970 | 0.8516 | 0.7543 | 0.8 | 0.0 | 0.0 | 0.0 | 0.8967 | 0.8580 | 0.8765 |
| 0.0151 | 1.0998 | 2600 | 0.0196 | 0.9772 | 0.9880 | 0.9925 | 0.9902 | 0.8465 | 0.7845 | 0.8143 | 0.8978 | 0.8913 | 0.8945 | 0.8241 | 0.7629 | 0.7923 | 0.0 | 0.0 | 0.0 | 0.8891 | 0.8578 | 0.8728 |
| 0.0152 | 1.1421 | 2700 | 0.0196 | 0.9773 | 0.9883 | 0.9923 | 0.9903 | 0.8279 | 0.8045 | 0.8160 | 0.9194 | 0.8715 | 0.8948 | 0.8109 | 0.8086 | 0.8097 | 0.0 | 0.0 | 0.0 | 0.8866 | 0.8692 | 0.8777 |
| 0.0152 | 1.1844 | 2800 | 0.0194 | 0.9774 | 0.9898 | 0.9910 | 0.9904 | 0.8227 | 0.8196 | 0.8212 | 0.9026 | 0.8851 | 0.8937 | 0.8065 | 0.7857 | 0.7959 | 0.0 | 0.0 | 0.0 | 0.8804 | 0.8704 | 0.8753 |
| 0.0153 | 1.2267 | 2900 | 0.0195 | 0.9774 | 0.9888 | 0.9918 | 0.9903 | 0.8313 | 0.8045 | 0.8177 | 0.9028 | 0.8910 | 0.8969 | 0.8901 | 0.6943 | 0.7801 | 0.0 | 0.0 | 0.0 | 0.9033 | 0.8454 | 0.8712 |
| 0.0153 | 1.2690 | 3000 | 0.0194 | 0.9776 | 0.9891 | 0.9917 | 0.9904 | 0.8323 | 0.8057 | 0.8188 | 0.9074 | 0.8919 | 0.8996 | 0.8047 | 0.7886 | 0.7965 | 0.0 | 0.0 | 0.0 | 0.8834 | 0.8695 | 0.8763 |
| 0.015 | 1.3113 | 3100 | 0.0193 | 0.9779 | 0.9898 | 0.9912 | 0.9905 | 0.8366 | 0.8077 | 0.8219 | 0.8945 | 0.9025 | 0.8985 | 0.7973 | 0.8314 | 0.8140 | 0.0 | 0.0 | 0.0 | 0.8795 | 0.8832 | 0.8812 |
| 0.0149 | 1.3536 | 3200 | 0.0193 | 0.9777 | 0.9895 | 0.9916 | 0.9906 | 0.8141 | 0.8315 | 0.8227 | 0.9258 | 0.8699 | 0.8970 | 0.8547 | 0.7229 | 0.7833 | 0.0 | 0.0 | 0.0 | 0.8960 | 0.8540 | 0.8734 |
| 0.0149 | 1.3959 | 3300 | 0.0190 | 0.9776 | 0.9899 | 0.9911 | 0.9905 | 0.8231 | 0.8207 | 0.8219 | 0.9053 | 0.8876 | 0.8964 | 0.8328 | 0.7829 | 0.8071 | 0.0 | 0.0 | 0.0 | 0.8878 | 0.8706 | 0.8790 |
| 0.0151 | 1.4382 | 3400 | 0.0191 | 0.9778 | 0.9885 | 0.9925 | 0.9905 | 0.8485 | 0.7888 | 0.8175 | 0.9010 | 0.8968 | 0.8989 | 0.8086 | 0.8086 | 0.8086 | 0.0 | 0.0 | 0.0 | 0.8866 | 0.8717 | 0.8789 |
| 0.0147 | 1.4805 | 3500 | 0.0190 | 0.9780 | 0.9889 | 0.9921 | 0.9905 | 0.8453 | 0.7972 | 0.8205 | 0.9005 | 0.9019 | 0.9012 | 0.8118 | 0.7886 | 0.8 | 0.0 | 0.0 | 0.0 | 0.8866 | 0.8699 | 0.8781 |
| 0.0149 | 1.5228 | 3600 | 0.0190 | 0.9780 | 0.9881 | 0.9930 | 0.9906 | 0.8478 | 0.7947 | 0.8204 | 0.9108 | 0.8854 | 0.8979 | 0.8318 | 0.7771 | 0.8035 | 0.0 | 0.0 | 0.0 | 0.8946 | 0.8626 | 0.8781 |
| 0.0153 | 1.5651 | 3700 | 0.0191 | 0.9780 | 0.9901 | 0.9911 | 0.9906 | 0.8183 | 0.8325 | 0.8253 | 0.9195 | 0.8802 | 0.8994 | 0.8086 | 0.8086 | 0.8086 | 0.0 | 0.0 | 0.0 | 0.8841 | 0.8781 | 0.8810 |
| 0.0147 | 1.6074 | 3800 | 0.0189 | 0.9784 | 0.9895 | 0.9921 | 0.9908 | 0.8434 | 0.8102 | 0.8265 | 0.9026 | 0.8956 | 0.8991 | 0.8230 | 0.7971 | 0.8099 | 0.0 | 0.0 | 0.0 | 0.8896 | 0.8737 | 0.8815 |
| 0.0146 | 1.6497 | 3900 | 0.0190 | 0.9781 | 0.9883 | 0.9930 | 0.9906 | 0.8443 | 0.8007 | 0.8219 | 0.9154 | 0.8810 | 0.8978 | 0.8364 | 0.7886 | 0.8118 | 0.0 | 0.0 | 0.0 | 0.8961 | 0.8658 | 0.8805 |
| 0.0148 | 1.6920 | 4000 | 0.0187 | 0.9783 | 0.9899 | 0.9916 | 0.9908 | 0.8318 | 0.8176 | 0.8246 | 0.9046 | 0.8952 | 0.8999 | 0.8517 | 0.7714 | 0.8096 | 0.0 | 0.0 | 0.0 | 0.8945 | 0.8690 | 0.8812 |
| 0.0145 | 1.7343 | 4100 | 0.0188 | 0.9782 | 0.9892 | 0.9923 | 0.9907 | 0.8415 | 0.8070 | 0.8239 | 0.9049 | 0.8911 | 0.8980 | 0.8328 | 0.7829 | 0.8071 | 0.0 | 0.0 | 0.0 | 0.8921 | 0.8683 | 0.8799 |
| 0.0143 | 1.7766 | 4200 | 0.0187 | 0.9782 | 0.9894 | 0.9920 | 0.9907 | 0.8319 | 0.8184 | 0.8251 | 0.9146 | 0.8843 | 0.8992 | 0.8293 | 0.7771 | 0.8024 | 0.0 | 0.0 | 0.0 | 0.8913 | 0.8680 | 0.8793 |
| 0.0147 | 1.8190 | 4300 | 0.0187 | 0.9785 | 0.9891 | 0.9925 | 0.9908 | 0.8392 | 0.8155 | 0.8272 | 0.9156 | 0.8834 | 0.8992 | 0.8492 | 0.7886 | 0.8178 | 0.0 | 0.0 | 0.0 | 0.8983 | 0.8700 | 0.8837 |
| 0.0145 | 1.8613 | 4400 | 0.0187 | 0.9782 | 0.9888 | 0.9925 | 0.9906 | 0.8421 | 0.8087 | 0.8251 | 0.9121 | 0.8853 | 0.8985 | 0.84 | 0.78 | 0.8089 | 0.0 | 0.0 | 0.0 | 0.8957 | 0.8666 | 0.8808 |
| 0.0145 | 1.9036 | 4500 | 0.0186 | 0.9783 | 0.9886 | 0.9926 | 0.9906 | 0.8498 | 0.7968 | 0.8225 | 0.9084 | 0.8957 | 0.9020 | 0.8068 | 0.8114 | 0.8091 | 0.0 | 0.0 | 0.0 | 0.8884 | 0.8741 | 0.8810 |
| 0.0144 | 1.9459 | 4600 | 0.0184 | 0.9785 | 0.9896 | 0.9920 | 0.9908 | 0.8417 | 0.8138 | 0.8275 | 0.9043 | 0.8956 | 0.8999 | 0.8434 | 0.8 | 0.8211 | 0.0 | 0.0 | 0.0 | 0.8948 | 0.8754 | 0.8848 |
| 0.0143 | 1.9882 | 4700 | 0.0183 | 0.9786 | 0.9893 | 0.9924 | 0.9908 | 0.8471 | 0.8089 | 0.8275 | 0.9060 | 0.8962 | 0.9011 | 0.8399 | 0.7943 | 0.8164 | 0.0 | 0.0 | 0.0 | 0.8955 | 0.8729 | 0.8840 |
| 0.012 | 2.0305 | 4800 | 0.0194 | 0.9785 | 0.9898 | 0.9918 | 0.9908 | 0.8426 | 0.8134 | 0.8278 | 0.9008 | 0.8994 | 0.9001 | 0.8246 | 0.8057 | 0.8150 | 0.0 | 0.0 | 0.0 | 0.8894 | 0.8776 | 0.8834 |
| 0.012 | 2.0728 | 4900 | 0.0192 | 0.9783 | 0.9895 | 0.9919 | 0.9907 | 0.8452 | 0.8062 | 0.8252 | 0.8996 | 0.9013 | 0.9004 | 0.8074 | 0.8143 | 0.8108 | 0.0 | 0.0 | 0.0 | 0.8854 | 0.8784 | 0.8818 |
| 0.0123 | 2.1151 | 5000 | 0.0192 | 0.9785 | 0.9897 | 0.9918 | 0.9907 | 0.8362 | 0.8212 | 0.8287 | 0.9101 | 0.8900 | 0.8999 | 0.8313 | 0.7886 | 0.8094 | 0.0 | 0.0 | 0.0 | 0.8918 | 0.8729 | 0.8822 |
| 0.0119 | 2.1574 | 5100 | 0.0197 | 0.9785 | 0.9904 | 0.9912 | 0.9908 | 0.8293 | 0.8289 | 0.8291 | 0.9076 | 0.8935 | 0.9005 | 0.8299 | 0.8086 | 0.8191 | 0.0 | 0.0 | 0.0 | 0.8893 | 0.8805 | 0.8849 |
| 0.0118 | 2.1997 | 5200 | 0.0196 | 0.9781 | 0.9899 | 0.9913 | 0.9906 | 0.8306 | 0.8207 | 0.8256 | 0.9057 | 0.8948 | 0.9002 | 0.8333 | 0.7857 | 0.8088 | 0.0 | 0.0 | 0.0 | 0.8899 | 0.8731 | 0.8813 |
| 0.0119 | 2.2420 | 5300 | 0.0196 | 0.9785 | 0.9903 | 0.9913 | 0.9908 | 0.8341 | 0.8220 | 0.8280 | 0.9016 | 0.9014 | 0.9015 | 0.8349 | 0.78 | 0.8065 | 0.0 | 0.0 | 0.0 | 0.8902 | 0.8737 | 0.8817 |
| 0.0117 | 2.2843 | 5400 | 0.0197 | 0.9784 | 0.9904 | 0.9911 | 0.9907 | 0.8371 | 0.8183 | 0.8276 | 0.8938 | 0.9065 | 0.9001 | 0.8323 | 0.7943 | 0.8129 | 0.0 | 0.0 | 0.0 | 0.8884 | 0.8775 | 0.8828 |
| 0.0119 | 2.3266 | 5500 | 0.0193 | 0.9786 | 0.9899 | 0.9917 | 0.9908 | 0.8400 | 0.8175 | 0.8286 | 0.9035 | 0.9005 | 0.9020 | 0.8406 | 0.7686 | 0.8030 | 0.0 | 0.0 | 0.0 | 0.8935 | 0.8696 | 0.8811 |
| 0.0119 | 2.3689 | 5600 | 0.0195 | 0.9786 | 0.9902 | 0.9915 | 0.9909 | 0.8353 | 0.8259 | 0.8306 | 0.9073 | 0.8938 | 0.9005 | 0.8174 | 0.8057 | 0.8115 | 0.0 | 0.0 | 0.0 | 0.8876 | 0.8792 | 0.8834 |
| 0.0117 | 2.4112 | 5700 | 0.0195 | 0.9784 | 0.9899 | 0.9916 | 0.9907 | 0.8339 | 0.8211 | 0.8275 | 0.9079 | 0.8937 | 0.9007 | 0.8408 | 0.8 | 0.8199 | 0.0 | 0.0 | 0.0 | 0.8931 | 0.8766 | 0.8847 |
| 0.0119 | 2.4535 | 5800 | 0.0194 | 0.9785 | 0.9899 | 0.9917 | 0.9908 | 0.8432 | 0.8140 | 0.8283 | 0.9007 | 0.9013 | 0.9010 | 0.8113 | 0.8229 | 0.8170 | 0.0 | 0.0 | 0.0 | 0.8862 | 0.8824 | 0.8843 |
| 0.012 | 2.4958 | 5900 | 0.0195 | 0.9785 | 0.9895 | 0.9919 | 0.9907 | 0.8370 | 0.8176 | 0.8272 | 0.9137 | 0.8911 | 0.9023 | 0.8067 | 0.8229 | 0.8147 | 0.0 | 0.0 | 0.0 | 0.8867 | 0.8809 | 0.8837 |
| 0.0118 | 2.5381 | 6000 | 0.0193 | 0.9785 | 0.9900 | 0.9915 | 0.9907 | 0.8349 | 0.8214 | 0.8281 | 0.9086 | 0.8975 | 0.9030 | 0.8246 | 0.8057 | 0.8150 | 0.0 | 0.0 | 0.0 | 0.8895 | 0.8790 | 0.8842 |
| 0.0116 | 2.5804 | 6100 | 0.0195 | 0.9786 | 0.9896 | 0.9919 | 0.9908 | 0.8433 | 0.8132 | 0.8279 | 0.9056 | 0.8981 | 0.9018 | 0.8174 | 0.8314 | 0.8244 | 0.0 | 0.0 | 0.0 | 0.8890 | 0.8837 | 0.8862 |
| 0.0118 | 2.6227 | 6200 | 0.0194 | 0.9785 | 0.9896 | 0.9918 | 0.9907 | 0.8394 | 0.8146 | 0.8268 | 0.9073 | 0.8976 | 0.9024 | 0.8358 | 0.8143 | 0.8249 | 0.0 | 0.0 | 0.0 | 0.8930 | 0.8796 | 0.8862 |
| 0.0117 | 2.6650 | 6300 | 0.0194 | 0.9786 | 0.9897 | 0.9919 | 0.9908 | 0.8422 | 0.8167 | 0.8292 | 0.9067 | 0.8976 | 0.9021 | 0.8218 | 0.8171 | 0.8195 | 0.0 | 0.0 | 0.0 | 0.8901 | 0.8808 | 0.8854 |
| 0.0116 | 2.7073 | 6400 | 0.0194 | 0.9786 | 0.9895 | 0.9921 | 0.9908 | 0.8421 | 0.8128 | 0.8272 | 0.9069 | 0.8959 | 0.9014 | 0.8333 | 0.8143 | 0.8237 | 0.0 | 0.0 | 0.0 | 0.8930 | 0.8788 | 0.8858 |
| 0.0114 | 2.7496 | 6500 | 0.0194 | 0.9786 | 0.9901 | 0.9915 | 0.9908 | 0.8382 | 0.8192 | 0.8286 | 0.9029 | 0.9002 | 0.9015 | 0.8368 | 0.8057 | 0.8210 | 0.0 | 0.0 | 0.0 | 0.8920 | 0.8792 | 0.8855 |
| 0.0116 | 2.7919 | 6600 | 0.0194 | 0.9786 | 0.9897 | 0.9919 | 0.9908 | 0.8426 | 0.8150 | 0.8286 | 0.9045 | 0.8981 | 0.9013 | 0.8213 | 0.8143 | 0.8178 | 0.0 | 0.0 | 0.0 | 0.8896 | 0.8798 | 0.8846 |
| 0.0118 | 2.8342 | 6700 | 0.0194 | 0.9786 | 0.9899 | 0.9917 | 0.9908 | 0.8398 | 0.8176 | 0.8285 | 0.9045 | 0.8976 | 0.9010 | 0.8251 | 0.8086 | 0.8167 | 0.0 | 0.0 | 0.0 | 0.8898 | 0.8789 | 0.8843 |
| 0.0118 | 2.8765 | 6800 | 0.0194 | 0.9785 | 0.9898 | 0.9918 | 0.9908 | 0.8402 | 0.8167 | 0.8283 | 0.9052 | 0.8973 | 0.9012 | 0.8275 | 0.8086 | 0.8179 | 0.0 | 0.0 | 0.0 | 0.8907 | 0.8786 | 0.8845 |
| 0.0117 | 2.9188 | 6900 | 0.0194 | 0.9786 | 0.9897 | 0.9919 | 0.9908 | 0.8414 | 0.8164 | 0.8287 | 0.9062 | 0.8968 | 0.9015 | 0.8242 | 0.8171 | 0.8207 | 0.0 | 0.0 | 0.0 | 0.8904 | 0.8806 | 0.8854 |
| 0.0117 | 2.9611 | 7000 | 0.0194 | 0.9786 | 0.9898 | 0.9918 | 0.9908 | 0.8413 | 0.8165 | 0.8288 | 0.9055 | 0.8971 | 0.9013 | 0.8218 | 0.8171 | 0.8195 | 0.0 | 0.0 | 0.0 | 0.8896 | 0.8807 | 0.8851 |

Framework versions

  • Transformers 4.49.0.dev0
  • Pytorch 2.6.0+cu124
  • Datasets 3.3.0
  • Tokenizers 0.21.0