BaViT5_v2

This model is a fine-tuned version of VietAI/vit5-large. The fine-tuning dataset is not specified. Per-epoch results on the evaluation set are listed in the table below.

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 16
eval_batch_size: 16
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 15
mixed_precision_training: Native AMP
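
As a rough illustration of what the linear scheduler implies, the sketch below computes the learning rate at a given optimizer step. It assumes no warmup (none is listed above), 2966 optimizer steps per epoch (taken from the step column of the results table), and a schedule planned over the full 15 configured epochs. The function name `linear_lr` and its parameters are hypothetical, not part of any library.

```python
STEPS_PER_EPOCH = 2966   # from the step column of the results table
NUM_EPOCHS = 15          # configured num_epochs
BASE_LR = 2e-05          # configured learning_rate


def linear_lr(step, base_lr=BASE_LR,
              total_steps=STEPS_PER_EPOCH * NUM_EPOCHS,
              warmup_steps=0):
    """Learning rate after `step` updates: linear warmup, then linear decay to 0.

    warmup_steps=0 is an assumption; the card does not list a warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for epoch in (0, 4, 12, 15):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch:>2}: step {step:>5}, lr = {linear_lr(step):.3e}")
```

Under these assumptions the rate falls in a straight line from 2e-05 at step 0 to zero at step 44490, so it would still be well above zero at the last epoch reported in the results table.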

Training Loss	Epoch	Step	Validation Loss	Sacrebleu
0.5323	1.0	2966	0.4843	10.8807
0.4426	2.0	5932	0.4266	13.2481
0.3629	3.0	8898	0.4084	14.2709
0.3321	4.0	11864	0.4032	14.8016
0.2860	5.0	14830	0.4061	15.1102
0.2528	6.0	17796	0.4160	15.2808
0.2235	7.0	20762	0.4270	15.4345
0.2018	8.0	23728	0.4400	15.4360
0.1856	9.0	26694	0.4562	15.4902
0.1639	10.0	29660	0.4705	15.4167
0.1565	11.0	32626	0.4886	15.4478
0.1392	12.0	35592	0.5035	15.4189
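
To make the trend in the table easier to read, the following sketch picks out the epoch with the lowest validation loss and the epoch with the highest SacreBLEU score. The values are copied from the table above; the variable names are illustrative only, and this is not the selection procedure used during training, which the card does not describe.

```python
# (epoch, training_loss, validation_loss, sacrebleu), copied from the table above
rows = [
    (1, 0.5323, 0.4843, 10.8807),
    (2, 0.4426, 0.4266, 13.2481),
    (3, 0.3629, 0.4084, 14.2709),
    (4, 0.3321, 0.4032, 14.8016),
    (5, 0.2860, 0.4061, 15.1102),
    (6, 0.2528, 0.4160, 15.2808),
    (7, 0.2235, 0.4270, 15.4345),
    (8, 0.2018, 0.4400, 15.4360),
    (9, 0.1856, 0.4562, 15.4902),
    (10, 0.1639, 0.4705, 15.4167),
    (11, 0.1565, 0.4886, 15.4478),
    (12, 0.1392, 0.5035, 15.4189),
]

# Epoch with the lowest validation loss
best_loss = min(rows, key=lambda r: r[2])
# Epoch with the highest SacreBLEU score
best_bleu = max(rows, key=lambda r: r[3])

print(f"lowest validation loss: {best_loss[2]:.4f} at epoch {best_loss[0]}")
print(f"highest SacreBLEU:      {best_bleu[3]:.4f} at epoch {best_bleu[0]}")
```

The two criteria disagree: validation loss rises after its minimum while training loss keeps falling, and SacreBLEU continues to improve slightly for several more epochs before levelling off. Which checkpoint counts as best therefore depends on the metric chosen.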