MTF-ta-en-translation

This model is a fine-tuned version of parambharat/whisper-small-ta on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1324
  • BLEU score: 0.0299
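
Since the base checkpoint is a Whisper speech model, the sketch below shows one plausible way to run Tamil-to-English speech translation with this fine-tune. The audio file name, the librosa loading step, and the explicit task="translate" argument are assumptions; the card does not document usage.

```python
# A minimal inference sketch, assuming the checkpoint exposes the standard
# Whisper seq2seq interface for speech translation.
import librosa
from transformers import WhisperProcessor, WhisperForConditionalGeneration

model_id = "vrclc/MTF-small-ta-en"
processor = WhisperProcessor.from_pretrained(model_id)
model = WhisperForConditionalGeneration.from_pretrained(model_id)

# Whisper expects 16 kHz mono audio; the file name is a placeholder.
audio, _ = librosa.load("tamil_speech.wav", sr=16000)
inputs = processor(audio, sampling_rate=16000, return_tensors="pt")

# task="translate" requests English output; the fine-tuned generation
# config may already default to this.
generated_ids = model.generate(inputs.input_features, task="translate")
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0])
```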

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • training_steps: 5000
  • mixed_precision_training: Native AMP
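
For reproducibility, the sketch below shows how these values might map onto transformers.Seq2SeqTrainingArguments. The output directory is a placeholder and the model/data pipeline is omitted, since the training data is not documented.

```python
# A hedged sketch of the listed hyperparameters as Seq2SeqTrainingArguments.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./MTF-ta-en-translation",  # hypothetical path
    learning_rate=1e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=2,  # effective train batch size of 16
    lr_scheduler_type="linear",
    max_steps=5000,
    fp16=True,                      # native AMP mixed-precision training
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```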

Training results

| Training Loss | Epoch   | Step | Validation Loss | BLEU Score |
|:-------------:|:-------:|:----:|:---------------:|:----------:|
| 0.0939        | 2.9412  | 250  | 0.0864          | 0.0302     |
| 0.0227        | 5.8824  | 500  | 0.0998          | 0.0301     |
| 0.0048        | 8.8235  | 750  | 0.1083          | 0.0340     |
| 0.001         | 11.7647 | 1000 | 0.1132          | 0.0312     |
| 0.0005        | 14.7059 | 1250 | 0.1164          | 0.0308     |
| 0.0003        | 17.6471 | 1500 | 0.1189          | 0.0322     |
| 0.0002        | 20.5882 | 1750 | 0.1208          | 0.0311     |
| 0.0002        | 23.5294 | 2000 | 0.1225          | 0.0307     |
| 0.0002        | 26.4706 | 2250 | 0.1242          | 0.0334     |
| 0.0001        | 29.4118 | 2500 | 0.1256          | 0.0321     |
| 0.0001        | 32.3529 | 2750 | 0.1268          | 0.0327     |
| 0.0001        | 35.2941 | 3000 | 0.1277          | 0.0324     |
| 0.0001        | 38.2353 | 3250 | 0.1286          | 0.0311     |
| 0.0001        | 41.1765 | 3500 | 0.1295          | 0.0309     |
| 0.0001        | 44.1176 | 3750 | 0.1302          | 0.0311     |
| 0.0001        | 47.0588 | 4000 | 0.1308          | 0.0310     |
| 0.0001        | 50.0    | 4250 | 0.1314          | 0.0313     |
| 0.0001        | 52.9412 | 4500 | 0.1320          | 0.0298     |
| 0.0001        | 55.8824 | 4750 | 0.1323          | 0.0299     |
| 0.0           | 58.8235 | 5000 | 0.1324          | 0.0299     |
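
A BLEU score like the one reported above can be computed with the Hugging Face `evaluate` library. Whether this card's numbers come from sacrebleu, and on what scale (0-1 vs. 0-100), is not documented, so the sketch below is an assumption.

```python
# A hedged sketch of corpus-level BLEU evaluation; the predictions and
# references here are placeholders, not outputs of this model.
import evaluate

bleu = evaluate.load("sacrebleu")
predictions = ["the weather is good today"]    # model translations
references = [["the weather is nice today"]]   # one reference list per sample
result = bleu.compute(predictions=predictions, references=references)
print(result["score"])  # sacrebleu reports BLEU on a 0-100 scale
```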

Framework versions

  • Transformers 4.48.3
  • PyTorch 2.6.0+cu124
  • Datasets 3.2.0
  • Tokenizers 0.21.0