mt5-small-finetuned-cnn_dailymail
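This model is a fine-tuned version of google/mt5-small. The auto-generated card records the training dataset as None, that is, unspecified; the model name suggests CNN/DailyMail, but the card does not confirm this. It achieves the following results on the evaluation set (the values match the final training epoch):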
- Loss: 1.5244
- Rouge1: 23.8806
- Rouge2: 11.7122
- Rougel: 20.1043
- Rougelsum: 22.5041
- Bleu 1: 3.5889
- Bleu 2: 2.411
- Bleu 3: 1.7466
- Meteor: 11.8919
- Summary length (Lungime rezumat): 11.496
- Original length (Lungime original): 46.991
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5.6e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 4
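As a rough illustration of what the linear scheduler implies for the learning rate, the sketch below computes the rate at a given optimizer step. It assumes zero warmup steps (the card lists no warmup setting) and takes the total of 14332 steps from the training results table; `linear_lr` is a hypothetical helper written for this example, not part of any library.

```python
# Minimal sketch of a linear learning-rate schedule (optional warmup, then
# linear decay to zero). warmup_steps=0 is an assumption: the card does not
# list a warmup setting.

BASE_LR = 5.6e-05        # learning_rate from the hyperparameter list
STEPS_PER_EPOCH = 3583   # step count logged at epoch 1.0 in the results table
NUM_EPOCHS = 4           # num_epochs from the hyperparameter list
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 14332, matches the final row


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Return the learning rate after `step` optimizer steps (hypothetical helper)."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for epoch in range(NUM_EPOCHS + 1):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch}: step {step}, lr {linear_lr(step):.3e}")
```

Under these assumptions the rate starts at 5.6e-05, is halved by the end of epoch 2, and reaches zero at the last step.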
Training results
Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Bleu 1 | Bleu 2 | Bleu 3 | Meteor | Summary length | Original length |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2.9989 | 1.0 | 3583 | 1.6617 | 21.745 | 9.6834 | 17.6 | 20.1315 | 3.1902 | 2.0591 | 1.4759 | 10.57 | 11.408 | 46.991 |
1.8552 | 2.0 | 7166 | 1.5640 | 22.5336 | 10.3837 | 18.3609 | 20.9449 | 3.2826 | 2.1341 | 1.5187 | 11.0138 | 11.3677 | 46.991 |
1.7715 | 3.0 | 10749 | 1.5354 | 23.5705 | 11.4281 | 19.7129 | 22.1588 | 3.5276 | 2.3649 | 1.7132 | 11.7397 | 11.4513 | 46.991 |
1.7385 | 4.0 | 14332 | 1.5244 | 23.8806 | 11.7122 | 20.1043 | 22.5041 | 3.5889 | 2.411 | 1.7466 | 11.8919 | 11.496 | 46.991 |
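The step counts in the table give a rough bound on the size of the training set, which the card does not state. The sketch below assumes a single device, no gradient accumulation, and that a final partial batch is kept rather than dropped; none of these is confirmed by the card, and `example_count_bounds` is a hypothetical helper.

```python
# Rough estimate of the training-set size from steps per epoch and batch size.
# Assumptions (not confirmed by the card): one device, no gradient
# accumulation, and the last partial batch of each epoch is not dropped.

STEPS_PER_EPOCH = 3583   # step count logged at epoch 1.0
TRAIN_BATCH_SIZE = 16    # train_batch_size from the hyperparameter list


def example_count_bounds(steps_per_epoch, batch_size):
    """Return (lower, upper) bounds on the number of training examples."""
    # If every batch is full, the epoch covers steps * batch_size examples.
    upper = steps_per_epoch * batch_size
    # If the last batch holds a single example, the count is at its minimum.
    lower = (steps_per_epoch - 1) * batch_size + 1
    return lower, upper


if __name__ == "__main__":
    low, high = example_count_bounds(STEPS_PER_EPOCH, TRAIN_BATCH_SIZE)
    print(f"Estimated training examples: between {low} and {high}")
```

If training used several devices or gradient accumulation, the true count would be a multiple of this estimate.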
Framework versions
- Transformers 4.40.0
- Pytorch 2.2.2+cu118
- Datasets 2.19.0
- Tokenizers 0.19.1
Model tree for CyrexPro/mt5-small-finetuned-cnn_dailymail
Base model
google/mt5-small