# my_awesome_qa_model
This model is a fine-tuned version of google/muril-base-cased on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 2.2420
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 30
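The linear scheduler above can be sketched in plain Python. This is an illustrative sketch, not the library's implementation: it assumes zero warmup steps (no warmup is recorded above) and uses the 2,610 total steps implied by the results table (30 epochs × 87 steps per epoch).

```python
def linear_lr(step, base_lr=2e-4, total_steps=2610, warmup_steps=0):
    """Linearly warm up to base_lr, then decay linearly to 0 by total_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

print(linear_lr(0))     # start of training: full base learning rate (0.0002)
print(linear_lr(1305))  # halfway, end of epoch 15: half the base rate (0.0001)
print(linear_lr(2610))  # end of training: decayed to 0.0
```

With this schedule the effective learning rate during the late epochs is tiny, which is why the training loss flattens near zero while the validation loss drifts.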
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log | 1.0 | 87 | 3.3597 |
| 3.9557 | 2.0 | 174 | 2.1384 |
| 2.3734 | 3.0 | 261 | 1.3267 |
| 1.2469 | 4.0 | 348 | 1.0861 |
| 0.709 | 5.0 | 435 | 1.0629 |
| 0.4988 | 6.0 | 522 | 1.6941 |
| 0.3718 | 7.0 | 609 | 1.3660 |
| 0.3718 | 8.0 | 696 | 2.0104 |
| 0.2292 | 9.0 | 783 | 2.1057 |
| 0.1848 | 10.0 | 870 | 2.1225 |
| 0.1241 | 11.0 | 957 | 1.8473 |
| 0.1352 | 12.0 | 1044 | 1.5934 |
| 0.0767 | 13.0 | 1131 | 1.7822 |
| 0.0589 | 14.0 | 1218 | 1.9077 |
| 0.0502 | 15.0 | 1305 | 1.9062 |
| 0.0502 | 16.0 | 1392 | 1.9073 |
| 0.0559 | 17.0 | 1479 | 1.9963 |
| 0.0441 | 18.0 | 1566 | 1.7880 |
| 0.0296 | 19.0 | 1653 | 2.3304 |
| 0.0204 | 20.0 | 1740 | 2.3634 |
| 0.0165 | 21.0 | 1827 | 2.1404 |
| 0.0152 | 22.0 | 1914 | 1.8899 |
| 0.01 | 23.0 | 2001 | 2.0763 |
| 0.01 | 24.0 | 2088 | 2.2466 |
| 0.0079 | 25.0 | 2175 | 2.2306 |
| 0.0072 | 26.0 | 2262 | 2.2067 |
| 0.0128 | 27.0 | 2349 | 2.2512 |
| 0.0113 | 28.0 | 2436 | 2.2725 |
| 0.0062 | 29.0 | 2523 | 2.2457 |
| 0.0064 | 30.0 | 2610 | 2.2420 |
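Note that the validation loss bottoms out at epoch 5 (1.0629) and then climbs as the model overfits, so the final checkpoint (2.2420) is not the best one. A minimal sketch of selecting the best epoch from the series above (values copied from the table; in `transformers`, `TrainingArguments(load_best_model_at_end=True)` automates this kind of selection):

```python
# Validation loss per epoch, copied from the results table above.
val_loss = [3.3597, 2.1384, 1.3267, 1.0861, 1.0629, 1.6941, 1.3660,
            2.0104, 2.1057, 2.1225, 1.8473, 1.5934, 1.7822, 1.9077,
            1.9062, 1.9073, 1.9963, 1.7880, 2.3304, 2.3634, 2.1404,
            1.8899, 2.0763, 2.2466, 2.2306, 2.2067, 2.2512, 2.2725,
            2.2457, 2.2420]

# Epochs are 1-indexed; min() over (loss, epoch) pairs finds the minimum loss.
best_loss, best_epoch = min((loss, epoch)
                            for epoch, loss in enumerate(val_loss, start=1))
print(best_epoch, best_loss)  # → 5 1.0629
```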
### Framework versions
- Transformers 4.52.4
- PyTorch 2.6.0+cu124
- Datasets 2.14.4
- Tokenizers 0.21.1