QA_BERT_40_epoch

This model was trained from scratch (the training dataset is not specified in this card). It achieves the following results on the evaluation set:

  • Loss: 5.2702

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 160
  • eval_batch_size: 40
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 500
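The results table below shows 2 optimizer steps per epoch (step 1000 at epoch 500), so the linear scheduler decays the learning rate over 1000 steps in total. As a minimal sketch, and assuming no warmup (the card lists no warmup steps), the schedule can be reproduced like this:

```python
def linear_lr(step, base_lr=2e-5, total_steps=1000):
    """Learning rate after `step` optimizer steps under linear decay.

    total_steps = 2 steps/epoch x 500 epochs = 1000, matching the
    step column of the results table below.
    """
    return base_lr * max(0.0, 1 - step / total_steps)

print(linear_lr(0))    # base rate 2e-05 at the start of training
print(linear_lr(500))  # halfway through: 1e-05
```

At step 1000 the rate reaches zero, so the final epoch trains with an almost-vanishing learning rate.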

Training results

"No log" in the Training Loss column means the training loss had not yet been logged at that step; it is reported every 500 steps (0.7832 at step 500, 0.4349 at step 1000).

Training Loss Epoch Step Validation Loss
No log 1.0 2 5.5836
No log 2.0 4 4.9665
No log 3.0 6 4.3529
No log 4.0 8 3.8760
No log 5.0 10 3.6382
No log 6.0 12 3.5459
No log 7.0 14 3.5174
No log 8.0 16 3.4454
No log 9.0 18 3.3175
No log 10.0 20 3.2621
No log 11.0 22 3.2679
No log 12.0 24 3.2291
No log 13.0 26 3.2395
No log 14.0 28 3.2465
No log 15.0 30 3.2500
No log 16.0 32 3.2689
No log 17.0 34 3.3377
No log 18.0 36 3.3268
No log 19.0 38 3.3848
No log 20.0 40 3.4531
No log 21.0 42 3.4366
No log 22.0 44 3.3987
No log 23.0 46 3.5183
No log 24.0 48 3.5116
No log 25.0 50 3.4385
No log 26.0 52 3.5282
No log 27.0 54 3.5057
No log 28.0 56 3.4458
No log 29.0 58 3.5851
No log 30.0 60 3.5581
No log 31.0 62 3.5042
No log 32.0 64 3.6945
No log 33.0 66 3.6743
No log 34.0 68 3.5845
No log 35.0 70 3.7131
No log 36.0 72 3.7195
No log 37.0 74 3.5873
No log 38.0 76 3.6404
No log 39.0 78 3.7562
No log 40.0 80 3.6104
No log 41.0 82 3.6501
No log 42.0 84 3.7878
No log 43.0 86 3.6725
No log 44.0 88 3.7462
No log 45.0 90 3.7865
No log 46.0 92 3.7666
No log 47.0 94 3.9332
No log 48.0 96 3.9917
No log 49.0 98 3.9490
No log 50.0 100 3.9449
No log 51.0 102 4.0781
No log 52.0 104 4.2149
No log 53.0 106 4.1135
No log 54.0 108 4.2498
No log 55.0 110 4.4105
No log 56.0 112 4.3640
No log 57.0 114 4.2132
No log 58.0 116 4.2122
No log 59.0 118 4.3662
No log 60.0 120 4.4326
No log 61.0 122 4.3963
No log 62.0 124 4.3008
No log 63.0 126 4.4562
No log 64.0 128 4.5806
No log 65.0 130 4.5211
No log 66.0 132 4.5000
No log 67.0 134 4.4879
No log 68.0 136 4.7343
No log 69.0 138 4.6160
No log 70.0 140 4.6651
No log 71.0 142 4.7094
No log 72.0 144 4.7363
No log 73.0 146 4.7983
No log 74.0 148 4.6931
No log 75.0 150 4.8674
No log 76.0 152 4.6793
No log 77.0 154 4.4066
No log 78.0 156 4.4354
No log 79.0 158 4.5461
No log 80.0 160 4.6051
No log 81.0 162 4.6052
No log 82.0 164 4.6206
No log 83.0 166 4.6426
No log 84.0 168 4.6186
No log 85.0 170 4.6823
No log 86.0 172 4.6801
No log 87.0 174 4.6715
No log 88.0 176 4.6989
No log 89.0 178 4.7356
No log 90.0 180 4.7391
No log 91.0 182 4.7629
No log 92.0 184 4.8833
No log 93.0 186 4.6954
No log 94.0 188 4.6447
No log 95.0 190 4.8187
No log 96.0 192 4.9326
No log 97.0 194 4.9601
No log 98.0 196 4.9014
No log 99.0 198 4.8017
No log 100.0 200 4.7338
No log 101.0 202 4.7982
No log 102.0 204 4.8642
No log 103.0 206 4.8707
No log 104.0 208 4.6649
No log 105.0 210 4.5818
No log 106.0 212 4.7099
No log 107.0 214 4.8475
No log 108.0 216 4.8508
No log 109.0 218 4.7317
No log 110.0 220 4.6941
No log 111.0 222 4.7462
No log 112.0 224 4.7574
No log 113.0 226 4.7973
No log 114.0 228 4.8107
No log 115.0 230 4.7634
No log 116.0 232 4.7674
No log 117.0 234 4.8727
No log 118.0 236 4.9352
No log 119.0 238 5.0857
No log 120.0 240 5.0789
No log 121.0 242 5.0191
No log 122.0 244 4.9788
No log 123.0 246 4.9839
No log 124.0 248 5.0862
No log 125.0 250 5.1913
No log 126.0 252 5.1812
No log 127.0 254 5.0316
No log 128.0 256 4.9544
No log 129.0 258 4.9150
No log 130.0 260 4.9089
No log 131.0 262 4.9668
No log 132.0 264 4.9633
No log 133.0 266 4.9144
No log 134.0 268 4.9101
No log 135.0 270 4.9025
No log 136.0 272 5.0318
No log 137.0 274 5.1524
No log 138.0 276 5.1386
No log 139.0 278 4.9144
No log 140.0 280 5.0827
No log 141.0 282 5.0240
No log 142.0 284 4.9660
No log 143.0 286 5.3053
No log 144.0 288 5.2233
No log 145.0 290 5.0342
No log 146.0 292 4.9655
No log 147.0 294 5.0220
No log 148.0 296 5.0608
No log 149.0 298 5.0743
No log 150.0 300 4.9205
No log 151.0 302 4.9447
No log 152.0 304 4.9659
No log 153.0 306 4.9111
No log 154.0 308 4.9579
No log 155.0 310 5.0062
No log 156.0 312 5.1133
No log 157.0 314 5.2901
No log 158.0 316 5.3683
No log 159.0 318 5.4181
No log 160.0 320 5.2470
No log 161.0 322 5.1100
No log 162.0 324 5.1077
No log 163.0 326 5.3202
No log 164.0 328 5.3461
No log 165.0 330 5.1452
No log 166.0 332 5.0381
No log 167.0 334 5.0034
No log 168.0 336 4.8378
No log 169.0 338 4.9186
No log 170.0 340 5.1349
No log 171.0 342 5.5404
No log 172.0 344 5.5171
No log 173.0 346 5.1810
No log 174.0 348 4.8270
No log 175.0 350 4.7501
No log 176.0 352 4.8160
No log 177.0 354 4.9254
No log 178.0 356 5.0300
No log 179.0 358 5.2502
No log 180.0 360 5.2969
No log 181.0 362 5.1923
No log 182.0 364 5.0364
No log 183.0 366 4.9416
No log 184.0 368 4.9631
No log 185.0 370 5.0063
No log 186.0 372 5.1270
No log 187.0 374 5.2420
No log 188.0 376 5.2554
No log 189.0 378 5.1474
No log 190.0 380 5.0355
No log 191.0 382 5.0567
No log 192.0 384 4.9559
No log 193.0 386 5.1435
No log 194.0 388 5.3238
No log 195.0 390 5.3907
No log 196.0 392 5.3311
No log 197.0 394 5.2927
No log 198.0 396 5.2811
No log 199.0 398 5.2599
No log 200.0 400 5.2294
No log 201.0 402 5.1436
No log 202.0 404 5.0589
No log 203.0 406 5.0693
No log 204.0 408 5.1252
No log 205.0 410 5.1331
No log 206.0 412 5.1748
No log 207.0 414 5.2408
No log 208.0 416 5.2533
No log 209.0 418 5.1634
No log 210.0 420 5.0228
No log 211.0 422 4.9801
No log 212.0 424 5.0686
No log 213.0 426 5.2167
No log 214.0 428 5.3111
No log 215.0 430 5.2456
No log 216.0 432 5.1329
No log 217.0 434 5.0976
No log 218.0 436 5.1051
No log 219.0 438 5.1902
No log 220.0 440 5.2540
No log 221.0 442 5.3165
No log 222.0 444 5.3993
No log 223.0 446 5.4198
No log 224.0 448 5.3331
No log 225.0 450 5.1872
No log 226.0 452 5.1843
No log 227.0 454 5.1687
No log 228.0 456 5.0710
No log 229.0 458 5.0474
No log 230.0 460 5.1511
No log 231.0 462 5.2742
No log 232.0 464 5.3806
No log 233.0 466 5.4119
No log 234.0 468 5.4155
No log 235.0 470 5.3472
No log 236.0 472 5.1764
No log 237.0 474 5.0515
No log 238.0 476 4.9411
No log 239.0 478 5.0025
No log 240.0 480 5.1235
No log 241.0 482 5.1967
No log 242.0 484 5.1897
No log 243.0 486 5.1283
No log 244.0 488 5.0631
No log 245.0 490 4.8919
No log 246.0 492 4.7609
No log 247.0 494 4.7219
No log 248.0 496 4.7634
No log 249.0 498 4.8325
0.7832 250.0 500 4.9378
0.7832 251.0 502 4.9677
0.7832 252.0 504 4.7211
0.7832 253.0 506 4.9088
0.7832 254.0 508 4.9177
0.7832 255.0 510 4.8639
0.7832 256.0 512 5.1184
0.7832 257.0 514 5.2668
0.7832 258.0 516 5.1835
0.7832 259.0 518 5.0206
0.7832 260.0 520 4.9770
0.7832 261.0 522 5.0736
0.7832 262.0 524 5.1353
0.7832 263.0 526 5.1106
0.7832 264.0 528 5.1155
0.7832 265.0 530 5.1459
0.7832 266.0 532 5.2114
0.7832 267.0 534 5.3258
0.7832 268.0 536 5.3890
0.7832 269.0 538 5.3609
0.7832 270.0 540 5.3051
0.7832 271.0 542 5.2518
0.7832 272.0 544 5.2327
0.7832 273.0 546 5.2061
0.7832 274.0 548 5.2223
0.7832 275.0 550 5.2319
0.7832 276.0 552 5.2454
0.7832 277.0 554 5.2423
0.7832 278.0 556 5.2425
0.7832 279.0 558 5.2659
0.7832 280.0 560 5.2719
0.7832 281.0 562 5.2720
0.7832 282.0 564 5.3218
0.7832 283.0 566 5.3582
0.7832 284.0 568 5.3460
0.7832 285.0 570 5.3295
0.7832 286.0 572 5.3079
0.7832 287.0 574 5.2975
0.7832 288.0 576 5.3203
0.7832 289.0 578 5.2335
0.7832 290.0 580 5.2119
0.7832 291.0 582 5.5726
0.7832 292.0 584 6.0619
0.7832 293.0 586 5.7864
0.7832 294.0 588 5.3725
0.7832 295.0 590 5.1521
0.7832 296.0 592 5.1235
0.7832 297.0 594 5.0175
0.7832 298.0 596 5.0574
0.7832 299.0 598 5.2451
0.7832 300.0 600 5.3431
0.7832 301.0 602 5.3769
0.7832 302.0 604 5.2665
0.7832 303.0 606 5.2002
0.7832 304.0 608 5.2058
0.7832 305.0 610 5.2341
0.7832 306.0 612 5.2454
0.7832 307.0 614 5.2517
0.7832 308.0 616 5.2628
0.7832 309.0 618 5.2776
0.7832 310.0 620 5.2929
0.7832 311.0 622 5.3052
0.7832 312.0 624 5.3247
0.7832 313.0 626 5.3325
0.7832 314.0 628 5.3221
0.7832 315.0 630 5.3057
0.7832 316.0 632 5.2909
0.7832 317.0 634 5.2786
0.7832 318.0 636 5.2701
0.7832 319.0 638 5.2690
0.7832 320.0 640 5.2641
0.7832 321.0 642 5.2536
0.7832 322.0 644 5.2575
0.7832 323.0 646 5.2663
0.7832 324.0 648 5.2774
0.7832 325.0 650 5.2961
0.7832 326.0 652 5.3104
0.7832 327.0 654 5.3289
0.7832 328.0 656 5.3471
0.7832 329.0 658 5.3710
0.7832 330.0 660 5.3863
0.7832 331.0 662 5.3941
0.7832 332.0 664 5.3804
0.7832 333.0 666 5.3571
0.7832 334.0 668 5.3233
0.7832 335.0 670 5.2989
0.7832 336.0 672 5.2867
0.7832 337.0 674 5.2822
0.7832 338.0 676 5.2842
0.7832 339.0 678 5.2866
0.7832 340.0 680 5.2857
0.7832 341.0 682 5.2811
0.7832 342.0 684 5.2718
0.7832 343.0 686 5.2670
0.7832 344.0 688 5.2515
0.7832 345.0 690 5.2412
0.7832 346.0 692 5.2758
0.7832 347.0 694 5.3152
0.7832 348.0 696 5.3359
0.7832 349.0 698 5.3116
0.7832 350.0 700 5.3171
0.7832 351.0 702 5.3231
0.7832 352.0 704 5.3244
0.7832 353.0 706 5.2968
0.7832 354.0 708 5.2512
0.7832 355.0 710 5.2017
0.7832 356.0 712 5.1600
0.7832 357.0 714 5.1562
0.7832 358.0 716 5.1819
0.7832 359.0 718 5.2067
0.7832 360.0 720 5.2122
0.7832 361.0 722 5.1902
0.7832 362.0 724 5.1657
0.7832 363.0 726 5.1438
0.7832 364.0 728 5.1423
0.7832 365.0 730 5.1638
0.7832 366.0 732 5.1772
0.7832 367.0 734 5.1856
0.7832 368.0 736 5.1777
0.7832 369.0 738 5.1616
0.7832 370.0 740 5.1377
0.7832 371.0 742 5.1160
0.7832 372.0 744 5.1196
0.7832 373.0 746 5.1877
0.7832 374.0 748 5.2327
0.7832 375.0 750 5.3493
0.7832 376.0 752 5.4902
0.7832 377.0 754 5.5718
0.7832 378.0 756 5.6042
0.7832 379.0 758 5.5747
0.7832 380.0 760 5.4605
0.7832 381.0 762 5.3356
0.7832 382.0 764 5.2469
0.7832 383.0 766 5.1889
0.7832 384.0 768 5.1443
0.7832 385.0 770 5.1132
0.7832 386.0 772 5.0938
0.7832 387.0 774 5.0783
0.7832 388.0 776 5.0807
0.7832 389.0 778 5.0945
0.7832 390.0 780 5.1039
0.7832 391.0 782 5.1355
0.7832 392.0 784 5.1741
0.7832 393.0 786 5.2368
0.7832 394.0 788 5.3176
0.7832 395.0 790 5.3733
0.7832 396.0 792 5.4010
0.7832 397.0 794 5.3962
0.7832 398.0 796 5.3758
0.7832 399.0 798 5.3272
0.7832 400.0 800 5.2777
0.7832 401.0 802 5.2432
0.7832 402.0 804 5.2261
0.7832 403.0 806 5.2164
0.7832 404.0 808 5.2188
0.7832 405.0 810 5.2323
0.7832 406.0 812 5.2483
0.7832 407.0 814 5.2813
0.7832 408.0 816 5.3437
0.7832 409.0 818 5.4606
0.7832 410.0 820 5.5911
0.7832 411.0 822 5.6450
0.7832 412.0 824 5.5630
0.7832 413.0 826 5.4683
0.7832 414.0 828 5.3558
0.7832 415.0 830 5.2639
0.7832 416.0 832 5.2379
0.7832 417.0 834 5.2238
0.7832 418.0 836 5.2130
0.7832 419.0 838 5.1871
0.7832 420.0 840 5.1823
0.7832 421.0 842 5.1769
0.7832 422.0 844 5.1814
0.7832 423.0 846 5.1979
0.7832 424.0 848 5.2140
0.7832 425.0 850 5.2258
0.7832 426.0 852 5.2430
0.7832 427.0 854 5.2550
0.7832 428.0 856 5.2620
0.7832 429.0 858 5.2588
0.7832 430.0 860 5.2557
0.7832 431.0 862 5.2543
0.7832 432.0 864 5.2332
0.7832 433.0 866 5.2196
0.7832 434.0 868 5.2143
0.7832 435.0 870 5.2092
0.7832 436.0 872 5.2010
0.7832 437.0 874 5.1949
0.7832 438.0 876 5.1995
0.7832 439.0 878 5.2246
0.7832 440.0 880 5.2296
0.7832 441.0 882 5.2022
0.7832 442.0 884 5.1874
0.7832 443.0 886 5.3347
0.7832 444.0 888 5.4867
0.7832 445.0 890 5.5504
0.7832 446.0 892 5.5327
0.7832 447.0 894 5.4747
0.7832 448.0 896 5.4291
0.7832 449.0 898 5.3723
0.7832 450.0 900 5.3292
0.7832 451.0 902 5.2965
0.7832 452.0 904 5.2549
0.7832 453.0 906 5.2493
0.7832 454.0 908 5.2140
0.7832 455.0 910 5.2419
0.7832 456.0 912 5.3364
0.7832 457.0 914 5.4087
0.7832 458.0 916 5.4277
0.7832 459.0 918 5.2826
0.7832 460.0 920 5.1621
0.7832 461.0 922 5.0967
0.7832 462.0 924 5.0656
0.7832 463.0 926 5.0777
0.7832 464.0 928 5.1370
0.7832 465.0 930 5.1932
0.7832 466.0 932 5.2190
0.7832 467.0 934 5.2236
0.7832 468.0 936 5.2225
0.7832 469.0 938 5.2229
0.7832 470.0 940 5.2252
0.7832 471.0 942 5.2202
0.7832 472.0 944 5.2417
0.7832 473.0 946 5.2864
0.7832 474.0 948 5.3279
0.7832 475.0 950 5.3707
0.7832 476.0 952 5.3521
0.7832 477.0 954 5.3306
0.7832 478.0 956 5.3098
0.7832 479.0 958 5.2887
0.7832 480.0 960 5.2756
0.7832 481.0 962 5.3118
0.7832 482.0 964 5.4077
0.7832 483.0 966 5.5179
0.7832 484.0 968 5.6063
0.7832 485.0 970 5.6672
0.7832 486.0 972 5.6948
0.7832 487.0 974 5.7026
0.7832 488.0 976 5.6912
0.7832 489.0 978 5.6624
0.7832 490.0 980 5.7197
0.7832 491.0 982 5.7408
0.7832 492.0 984 5.7196
0.7832 493.0 986 5.6736
0.7832 494.0 988 5.5864
0.7832 495.0 990 5.4609
0.7832 496.0 992 5.3666
0.7832 497.0 994 5.2701
0.7832 498.0 996 5.2414
0.7832 499.0 998 5.2953
0.4349 500.0 1000 5.2702
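The validation loss above bottoms out at 3.2291 around epoch 12 and trends upward for the remaining ~490 epochs, a clear sign of overfitting. A patience-based early-stopping check (a sketch only; the original run trained for all 500 epochs without one) shows how much sooner training could have halted:

```python
def early_stop_epoch(losses, patience=10):
    """Return the 1-based epoch at which training would stop: the first
    epoch after `patience` consecutive evaluations with no new best loss."""
    best, since_best = float("inf"), 0
    for epoch, loss in enumerate(losses, start=1):
        if loss < best:
            best, since_best = loss, 0
        else:
            since_best += 1
            if since_best >= patience:
                return epoch
    return len(losses)

# First 25 validation losses from the table above (epochs 1-25):
val = [5.5836, 4.9665, 4.3529, 3.8760, 3.6382, 3.5459, 3.5174,
       3.4454, 3.3175, 3.2621, 3.2679, 3.2291, 3.2395, 3.2465,
       3.2500, 3.2689, 3.3377, 3.3268, 3.3848, 3.4531, 3.4366,
       3.3987, 3.5183, 3.5116, 3.4385]
print(early_stop_epoch(val))  # stops at epoch 22; best loss 3.2291 at epoch 12
```

In the Transformers ecosystem the same effect is typically achieved with `EarlyStoppingCallback` plus `load_best_model_at_end=True`, which would also restore the epoch-12 checkpoint instead of the final one.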

Framework versions

  • Transformers 4.36.2
  • Pytorch 2.1.2+cu121
  • Datasets 2.14.7
  • Tokenizers 0.15.0

Model files

  • Format: Safetensors
  • Model size: 108M params
  • Tensor type: F32