SFTvit5-large_sum-10k_23Feb-2025

This model is a fine-tuned version of VietAI/vit5-large-vietnews-summarization. The card does not record the name of the fine-tuning dataset. Per-epoch results on the evaluation set are listed in the table at the end of this card.

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 8
eval_batch_size: 4
seed: 42
gradient_accumulation_steps: 8
total_train_batch_size: 64
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 10
mixed_precision_training: Native AMP
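
The relationship between some of these values can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes a single training device and zero warmup steps (neither is stated in this card), takes the final step count of 1680 from the results table, and uses hypothetical helper names (`effective_batch_size`, `linear_lr`) that are not part of any library.

```python
# Values copied from the hyperparameter list above.
train_batch_size = 8
gradient_accumulation_steps = 8
learning_rate = 2e-05

# Assumptions, not stated in the card.
num_devices = 1    # assumed single device
warmup_steps = 0   # assumed no warmup
total_steps = 1680  # last step reported in the results table


def effective_batch_size(per_device, accumulation, devices):
    """Number of examples contributing to each optimizer update."""
    return per_device * accumulation * devices


def linear_lr(step, base_lr, total, warmup=0):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


total_train_batch_size = effective_batch_size(
    train_batch_size, gradient_accumulation_steps, num_devices
)
print("total_train_batch_size:", total_train_batch_size)

# Learning rate at the start, midpoint, and end of training.
for step in (0, total_steps // 2, total_steps):
    print(step, linear_lr(step, learning_rate, total_steps, warmup_steps))
```

Under these assumptions the product 8 × 8 reproduces the reported total_train_batch_size of 64, and the learning rate falls linearly from 2e-05 to zero over the run.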

Training Loss	Epoch	Step	Validation Loss	ROUGE-1	ROUGE-2	ROUGE-L	ROUGE-Lsum	Gen Len
No log	0.9978	168	0.5023	0.2519	0.1711	0.2119	0.2118	19.0
No log	1.9955	336	0.4599	0.254	0.1739	0.2142	0.2141	19.0
No log	2.9993	505	0.4481	0.2554	0.1767	0.2166	0.2165	19.0
No log	3.9970	673	0.4485	0.2561	0.1771	0.2161	0.216	19.0
No log	4.9948	841	0.4508	0.2544	0.176	0.2154	0.2153	19.0
1.139	5.9985	1010	0.4544	0.2548	0.1767	0.2158	0.2158	19.0
1.139	6.9963	1178	0.4562	0.256	0.1789	0.2171	0.217	19.0
1.139	8.0	1347	0.4632	0.2558	0.1776	0.2164	0.2163	19.0
1.139	8.9978	1515	0.4668	0.2555	0.1773	0.2163	0.2163	19.0
1.139	9.9777	1680	0.4682	0.2555	0.177	0.216	0.2159	19.0
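
Because validation loss and the ROUGE scores peak at different epochs, it can help to look up the best row per metric. The following is a minimal sketch that copies the rows from the table above into a list; the `EvalRow` type and its field names are hypothetical and chosen only for this illustration.

```python
from collections import namedtuple

# Hypothetical record type; the values are copied from the table above.
EvalRow = namedtuple(
    "EvalRow", "epoch step val_loss rouge1 rouge2 rougeL rougeLsum"
)

rows = [
    EvalRow(0.9978, 168, 0.5023, 0.2519, 0.1711, 0.2119, 0.2118),
    EvalRow(1.9955, 336, 0.4599, 0.2540, 0.1739, 0.2142, 0.2141),
    EvalRow(2.9993, 505, 0.4481, 0.2554, 0.1767, 0.2166, 0.2165),
    EvalRow(3.9970, 673, 0.4485, 0.2561, 0.1771, 0.2161, 0.2160),
    EvalRow(4.9948, 841, 0.4508, 0.2544, 0.1760, 0.2154, 0.2153),
    EvalRow(5.9985, 1010, 0.4544, 0.2548, 0.1767, 0.2158, 0.2158),
    EvalRow(6.9963, 1178, 0.4562, 0.2560, 0.1789, 0.2171, 0.2170),
    EvalRow(8.0, 1347, 0.4632, 0.2558, 0.1776, 0.2164, 0.2163),
    EvalRow(8.9978, 1515, 0.4668, 0.2555, 0.1773, 0.2163, 0.2163),
    EvalRow(9.9777, 1680, 0.4682, 0.2555, 0.1770, 0.2160, 0.2159),
]

# Lower is better for loss; higher is better for ROUGE.
best_loss = min(rows, key=lambda r: r.val_loss)
best_rouge1 = max(rows, key=lambda r: r.rouge1)
best_rouge2 = max(rows, key=lambda r: r.rouge2)
best_rougeL = max(rows, key=lambda r: r.rougeL)

print("lowest validation loss at step", best_loss.step)
print("highest ROUGE-1 at step", best_rouge1.step)
print("highest ROUGE-2 at step", best_rouge2.step)
print("highest ROUGE-L at step", best_rougeL.step)
```

In this table the validation loss is lowest at step 505 (about epoch 3) and rises afterwards, while the ROUGE scores change only in the third decimal place. That pattern may indicate that the later epochs add little, although the card does not say which checkpoint was published.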