---
library_name: transformers
base_model: csebuetnlp/banglabert
tags:
- generated_from_trainer
model-index:
- name: bengali_qa_model_AGGRO_banglabert
  results: []
---

# bengali_qa_model_AGGRO_banglabert

This model is a fine-tuned version of [csebuetnlp/banglabert](https://huggingface.co/csebuetnlp/banglabert) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.2676
- Exact Match: 98.5714
- F1 Score: 99.0056

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training; a `TrainingArguments` sketch follows the list:
- learning_rate: 2e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 3407
- gradient_accumulation_steps: 16
- total_train_batch_size: 64
- optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- training_steps: 100
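For reference, a minimal sketch of an equivalent `TrainingArguments` configuration. The `output_dir` is a placeholder, and any settings not listed above (logging, saving, evaluation cadence) are assumptions left at their transformers defaults:

```python
from transformers import TrainingArguments

# Sketch of the hyperparameters listed above; output_dir is a placeholder
# and unlisted settings are left at their transformers defaults.
training_args = TrainingArguments(
    output_dir="bengali_qa_model_AGGRO_banglabert",  # placeholder path
    learning_rate=2e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=3407,
    gradient_accumulation_steps=16,  # 4 x 16 = total train batch size 64
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    max_steps=100,
)
```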
### Training results

| Training Loss | Epoch | Step | Validation Loss | Exact Match | F1 Score |
|:-------------:|:------:|:----:|:---------------:|:-----------:|:--------:|
| 6.0126 | 0.0053 | 1 | 5.9783 | 0.0 | 0.6103 |
| 6.0125 | 0.0107 | 2 | 5.9540 | 0.0 | 0.7848 |
| 5.9675 | 0.0160 | 3 | 5.9074 | 0.0 | 0.9597 |
| 5.9287 | 0.0214 | 4 | 5.8425 | 0.0 | 1.7507 |
| 5.8586 | 0.0267 | 5 | 5.7636 | 0.1504 | 4.2535 |
| 5.8206 | 0.0321 | 6 | 5.6740 | 0.4511 | 11.2628 |
| 5.7246 | 0.0374 | 7 | 5.5749 | 1.7293 | 23.3816 |
| 5.634 | 0.0428 | 8 | 5.4574 | 3.9850 | 37.7873 |
| 5.4963 | 0.0481 | 9 | 5.3105 | 5.7895 | 47.4987 |
| 5.2985 | 0.0535 | 10 | 5.1265 | 7.5940 | 52.0471 |
| 5.182 | 0.0588 | 11 | 4.8997 | 11.9549 | 54.5555 |
| 4.973 | 0.0641 | 12 | 4.6631 | 15.9398 | 56.5530 |
| 4.8353 | 0.0695 | 13 | 4.4348 | 19.6241 | 58.6313 |
| 4.6269 | 0.0748 | 14 | 4.2322 | 23.5338 | 60.7029 |
| 4.4238 | 0.0802 | 15 | 4.0467 | 28.0451 | 62.7494 |
| 4.1976 | 0.0855 | 16 | 3.8781 | 32.7068 | 64.6375 |
| 4.1302 | 0.0909 | 17 | 3.7200 | 35.7895 | 66.2513 |
| 3.9139 | 0.0962 | 18 | 3.5621 | 39.5489 | 67.7758 |
| 3.8521 | 0.1016 | 19 | 3.4019 | 43.0075 | 69.2899 |
| 3.7003 | 0.1069 | 20 | 3.2534 | 46.0150 | 70.6373 |
| 3.5972 | 0.1123 | 21 | 3.1168 | 48.8722 | 72.3043 |
| 3.5249 | 0.1176 | 22 | 2.9875 | 51.5038 | 73.2903 |
| 3.1756 | 0.1230 | 23 | 2.8600 | 53.6090 | 74.1609 |
| 3.2323 | 0.1283 | 24 | 2.7356 | 55.2632 | 74.8864 |
| 3.0696 | 0.1336 | 25 | 2.6150 | 56.8421 | 75.8938 |
| 2.9806 | 0.1390 | 26 | 2.5029 | 58.9474 | 77.3831 |
| 2.8261 | 0.1443 | 27 | 2.3997 | 61.1278 | 78.8467 |
| 2.8965 | 0.1497 | 28 | 2.3045 | 63.9098 | 80.8890 |
| 2.6622 | 0.1550 | 29 | 2.2151 | 66.0902 | 82.4263 |
| 2.5132 | 0.1604 | 30 | 2.1300 | 68.0451 | 83.8984 |
| 2.5076 | 0.1657 | 31 | 2.0482 | 70.9774 | 85.4846 |
| 2.2189 | 0.1711 | 32 | 1.9678 | 72.7068 | 86.2628 |
| 2.0851 | 0.1764 | 33 | 1.8883 | 75.8647 | 87.8992 |
| 2.1198 | 0.1818 | 34 | 1.8091 | 78.5714 | 89.4148 |
| 2.0272 | 0.1871 | 35 | 1.7300 | 80.8271 | 90.3877 |
| 1.9951 | 0.1924 | 36 | 1.6514 | 82.7068 | 91.2138 |
| 1.7741 | 0.1978 | 37 | 1.5736 | 84.9624 | 91.8920 |
| 1.9176 | 0.2031 | 38 | 1.4970 | 86.5414 | 92.3250 |
| 1.8599 | 0.2085 | 39 | 1.4219 | 87.5940 | 92.7578 |
| 1.8095 | 0.2138 | 40 | 1.3496 | 88.5714 | 93.0980 |
| 1.7814 | 0.2192 | 41 | 1.2790 | 90.0752 | 93.7737 |
| 1.4602 | 0.2245 | 42 | 1.2103 | 91.5038 | 94.6447 |
| 1.5147 | 0.2299 | 43 | 1.1431 | 92.1805 | 95.1039 |
| 1.4205 | 0.2352 | 44 | 1.0774 | 92.9323 | 95.4111 |
| 1.3222 | 0.2406 | 45 | 1.0127 | 93.9850 | 96.0199 |
| 1.2477 | 0.2459 | 46 | 0.9508 | 94.8120 | 96.5219 |
| 1.1406 | 0.2513 | 47 | 0.8936 | 95.2632 | 96.8391 |
| 1.1698 | 0.2566 | 48 | 0.8382 | 96.3158 | 97.5331 |
| 1.1359 | 0.2619 | 49 | 0.7847 | 97.0677 | 97.9841 |
| 1.1811 | 0.2673 | 50 | 0.7324 | 97.5940 | 98.4006 |
| 0.9734 | 0.2726 | 51 | 0.6814 | 97.7444 | 98.5321 |
| 0.928 | 0.2780 | 52 | 0.6318 | 97.8947 | 98.6140 |
| 0.8989 | 0.2833 | 53 | 0.5859 | 98.1203 | 98.7571 |
| 0.7784 | 0.2887 | 54 | 0.5430 | 98.3459 | 98.9243 |
| 1.0015 | 0.2940 | 55 | 0.5027 | 98.3459 | 98.8914 |
| 0.7509 | 0.2994 | 56 | 0.4656 | 98.5714 | 99.0811 |
| 0.6838 | 0.3047 | 57 | 0.4328 | 98.7970 | 99.1723 |
| 0.7336 | 0.3101 | 58 | 0.4042 | 98.8722 | 99.1327 |
| 0.5729 | 0.3154 | 59 | 0.3781 | 98.9474 | 99.2079 |
| 0.5891 | 0.3207 | 60 | 0.3538 | 99.0226 | 99.3362 |
| 0.6168 | 0.3261 | 61 | 0.3322 | 99.1729 | 99.4169 |
| 0.5503 | 0.3314 | 62 | 0.3130 | 99.1729 | 99.4169 |
| 0.5058 | 0.3368 | 63 | 0.2955 | 99.1729 | 99.4169 |
| 0.4065 | 0.3421 | 64 | 0.2788 | 99.3233 | 99.5000 |
| 0.4466 | 0.3475 | 65 | 0.2638 | 99.2481 | 99.4981 |
| 0.4727 | 0.3528 | 66 | 0.2496 | 99.2481 | 99.4981 |
| 0.45 | 0.3582 | 67 | 0.2365 | 99.2481 | 99.4981 |

### Framework versions

- Transformers 4.46.3
- Pytorch 2.4.0
- Datasets 3.1.0
- Tokenizers 0.20.3
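As a usage illustration, a minimal sketch of extractive question answering with the `transformers` pipeline. The repo ID is assumed from the model name in this card, and the Bengali question/context pair is illustrative only:

```python
from transformers import pipeline

# Minimal sketch: load the fine-tuned checkpoint for extractive QA.
# The repo ID below is assumed from the model name in this card.
qa = pipeline("question-answering", model="bengali_qa_model_AGGRO_banglabert")

result = qa(
    question="বাংলাদেশের রাজধানী কী?",  # "What is the capital of Bangladesh?"
    context="ঢাকা বাংলাদেশের রাজধানী।",  # "Dhaka is the capital of Bangladesh."
)
print(result)  # e.g. {'score': ..., 'start': ..., 'end': ..., 'answer': ...}
```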