mt5-small-finetuned-research-papers_summarization

This model is a fine-tuned version of google/mt5-small on an unspecified dataset (the dataset name was recorded as None). Per-epoch results on the evaluation set are listed in the table at the end of this card.

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 5.6e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 8
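
As a rough illustration of what these settings imply, the sketch below computes the learning rate under a linear decay schedule. It assumes zero warmup steps and 8000 optimizer steps in total (8 epochs at the 1000 steps per epoch reported in the results table). The card does not state a warmup setting, and `linear_lr` is a hypothetical helper written for this example, not a library function.

```python
# Hyperparameters copied from the list above.
LEARNING_RATE = 5.6e-05
NUM_EPOCHS = 8

# Assumption: 1000 optimizer steps per epoch, as reported in the results table.
STEPS_PER_EPOCH = 1000
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH

# Assumption: no warmup; the card does not list any warmup setting.
WARMUP_STEPS = 0


def linear_lr(step: int) -> float:
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    # Print the learning rate at each epoch boundary.
    for epoch in range(NUM_EPOCHS + 1):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch}: step {step:5d}, lr = {linear_lr(step):.2e}")
```

Under these assumptions the rate starts at 5.6e-05, reaches half that value at step 4000, and decays to zero at step 8000. With a train batch size of 8, 1000 steps per epoch would correspond to roughly 8000 training examples, provided training ran on a single device without gradient accumulation; the card does not confirm either.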

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	RougeL	RougeLsum
3.1253	1.0	1000	2.7268	36.0557	17.6368	31.5759	31.5878
3.1271	2.0	2000	2.5913	35.7496	17.1898	31.363	31.3869
2.9616	3.0	3000	2.5313	36.6682	17.9617	32.3196	32.3058
2.8517	4.0	4000	2.5230	37.6535	18.6802	33.0408	33.0654
2.7771	5.0	5000	2.5006	37.9256	19.0955	33.3906	33.3775
2.7229	6.0	6000	2.4774	38.0941	19.3515	33.5769	33.5549
2.6835	7.0	7000	2.4764	38.0013	19.3891	33.57	33.5473
2.6559	8.0	8000	2.4742	37.9518	19.3443	33.5815	33.5365
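
To make the trend in the table easier to read, the following sketch picks the best epoch for each metric. The values are copied from the table above; the variable names and the `best_epoch` helper are illustrative only and are not part of the training code.

```python
# Per-epoch evaluation results copied from the table above.
# Columns: epoch, validation loss, Rouge1, Rouge2, RougeL, RougeLsum.
RESULTS = [
    (1, 2.7268, 36.0557, 17.6368, 31.5759, 31.5878),
    (2, 2.5913, 35.7496, 17.1898, 31.3630, 31.3869),
    (3, 2.5313, 36.6682, 17.9617, 32.3196, 32.3058),
    (4, 2.5230, 37.6535, 18.6802, 33.0408, 33.0654),
    (5, 2.5006, 37.9256, 19.0955, 33.3906, 33.3775),
    (6, 2.4774, 38.0941, 19.3515, 33.5769, 33.5549),
    (7, 2.4764, 38.0013, 19.3891, 33.5700, 33.5473),
    (8, 2.4742, 37.9518, 19.3443, 33.5815, 33.5365),
]

# Map each metric name to its column index in RESULTS.
COLUMNS = {
    "validation_loss": 1,
    "rouge1": 2,
    "rouge2": 3,
    "rougeL": 4,
    "rougeLsum": 5,
}


def best_epoch(metric: str) -> int:
    """Return the epoch with the best value: lowest loss, highest ROUGE."""
    col = COLUMNS[metric]
    pick = min if metric == "validation_loss" else max
    return pick(RESULTS, key=lambda row: row[col])[0]


if __name__ == "__main__":
    for name in COLUMNS:
        print(f"{name}: best at epoch {best_epoch(name)}")
```

On these numbers, validation loss and RougeL are best at epoch 8, Rouge1 and RougeLsum peak at epoch 6, and Rouge2 peaks at epoch 7. The differences after epoch 6 are small, so the later epochs add little.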