whisper-large-v3-turbo-maltese-malta

This model is a fine-tuned version of openai/whisper-large-v3-turbo on the FLEURS dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4448
  • Model Preparation Time: 0.0067
  • Wer Ortho: 39.1346
  • Wer: 14.5151
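
For reference, a minimal inference sketch (not part of the original card) using the transformers ASR pipeline; the audio path is a placeholder, and passing the language and task through generate_kwargs assumes the standard Whisper generation options:

```python
import torch
from transformers import pipeline

# Load the fine-tuned checkpoint through the ASR pipeline.
asr = pipeline(
    "automatic-speech-recognition",
    model="SamuelPfisterer1/whisper-large-v3-turbo-maltese-malta",
    torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32,
    device="cuda" if torch.cuda.is_available() else "cpu",
)

# "sample.wav" is a placeholder; any 16 kHz Maltese audio clip works.
result = asr(
    "sample.wav",
    generate_kwargs={"language": "maltese", "task": "transcribe"},
)
print(result["text"])
```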

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
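
The card does not document the data beyond the FLEURS mention above. A hedged sketch of loading the Maltese FLEURS split with the datasets library; the "mt_mt" config name is an assumption inferred from the "maltese-malta" suffix of the model id:

```python
from datasets import Audio, load_dataset

# "mt_mt" (Maltese, Malta) is assumed from the model name, not stated in the card.
fleurs = load_dataset("google/fleurs", "mt_mt")

# FLEURS audio is already 16 kHz; the cast is a harmless safeguard for Whisper.
fleurs = fleurs.cast_column("audio", Audio(sampling_rate=16_000))
print(fleurs)
```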

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 128
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.06
  • num_epochs: 3
  • mixed_precision_training: Native AMP
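
A sketch of how these hyperparameters map onto Seq2SeqTrainingArguments from transformers; the output directory is a placeholder, and the trainer, data collator, and metric function are omitted:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="whisper-large-v3-turbo-maltese-malta",  # placeholder
    learning_rate=1e-5,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    gradient_accumulation_steps=2,   # effective train batch size: 64 x 2 = 128
    seed=42,
    optim="adamw_torch",             # betas=(0.9, 0.999), epsilon=1e-8 are the defaults
    lr_scheduler_type="linear",
    warmup_ratio=0.06,
    num_train_epochs=3,
    fp16=True,                       # "Native AMP" mixed precision
)
```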

Training results

| Training Loss | Epoch  | Step | Validation Loss | Model Preparation Time | Wer Ortho | Wer     |
|:-------------:|:------:|:----:|:---------------:|:----------------------:|:---------:|:-------:|
| 0.8035        | 0.1170 | 32   | 0.8262          | 0.0067                 | 59.6043   | 39.5151 |
| 0.4214        | 0.2340 | 64   | 0.6413          | 0.0067                 | 48.4838   | 25.9491 |
| 0.3364        | 0.3510 | 96   | 0.5759          | 0.0067                 | 45.7317   | 22.1842 |
| 0.2865        | 0.4680 | 128  | 0.5396          | 0.0067                 | 44.8798   | 21.5635 |
| 0.2636        | 0.5850 | 160  | 0.5267          | 0.0067                 | 43.1983   | 19.7733 |
| 0.2484        | 0.7020 | 192  | 0.4909          | 0.0067                 | 42.6321   | 18.4914 |
| 0.2494        | 0.8190 | 224  | 0.4828          | 0.0067                 | 42.1669   | 18.3834 |
| 0.2348        | 0.9360 | 256  | 0.4829          | 0.0067                 | 42.0380   | 17.7132 |
| 0.1985        | 1.0512 | 288  | 0.4681          | 0.0067                 | 40.7040   | 17.4883 |
| 0.1884        | 1.1682 | 320  | 0.4705          | 0.0067                 | 41.2757   | 16.9800 |
| 0.1878        | 1.2852 | 352  | 0.4599          | 0.0067                 | 40.6928   | 16.3908 |
| 0.1856        | 1.4022 | 384  | 0.4641          | 0.0067                 | 41.2869   | 16.7461 |
| 0.1873        | 1.5192 | 416  | 0.4506          | 0.0067                 | 41.2813   | 15.8645 |
| 0.1795        | 1.6362 | 448  | 0.4634          | 0.0067                 | 40.5190   | 16.5122 |
| 0.1802        | 1.7532 | 480  | 0.4758          | 0.0067                 | 40.5358   | 16.0175 |
| 0.1782        | 1.8702 | 512  | 0.4512          | 0.0067                 | 40.2388   | 15.7566 |
| 0.1785        | 1.9872 | 544  | 0.4510          | 0.0067                 | 40.0706   | 15.4777 |
| 0.1503        | 2.1024 | 576  | 0.4492          | 0.0067                 | 39.3756   | 15.0144 |
| 0.1507        | 2.2194 | 608  | 0.4624          | 0.0067                 | 39.8016   | 15.3787 |
| 0.1545        | 2.3364 | 640  | 0.4448          | 0.0067                 | 39.1346   | 14.5151 |
| 0.1431        | 2.4534 | 672  | 0.4501          | 0.0067                 | 39.7007   | 14.9604 |
| 0.1453        | 2.5704 | 704  | 0.4495          | 0.0067                 | 39.3980   | 15.0504 |
| 0.1435        | 2.6874 | 736  | 0.4595          | 0.0067                 | 39.9361   | 14.9019 |
| 0.1416        | 2.8044 | 768  | 0.4583          | 0.0067                 | 39.9249   | 14.9604 |
| 0.1459        | 2.9214 | 800  | 0.4549          | 0.0067                 | 39.9809   | 15.0999 |
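
The card reports both an orthographic WER ("Wer Ortho", presumably computed on raw text) and a lower normalized WER ("Wer"). A sketch, assuming the common Whisper fine-tuning setup where the normalized figure is computed after Whisper's BasicTextNormalizer; the example strings are hypothetical:

```python
import evaluate
from transformers.models.whisper.english_normalizer import BasicTextNormalizer

wer_metric = evaluate.load("wer")
normalizer = BasicTextNormalizer()

predictions = ["Il-kelb qed jiġri fil-ġnien."]  # hypothetical model output
references = ["Il-kelb qed jiġri fil-ġnien"]    # hypothetical reference

# Orthographic WER: raw strings, so casing and punctuation count as errors.
wer_ortho = 100 * wer_metric.compute(predictions=predictions, references=references)

# Normalized WER: both sides pass through the normalizer first.
wer = 100 * wer_metric.compute(
    predictions=[normalizer(p) for p in predictions],
    references=[normalizer(r) for r in references],
)
print(f"WER ortho: {wer_ortho:.2f}  WER: {wer:.2f}")
```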

Framework versions

  • Transformers 4.51.3
  • PyTorch 2.5.1
  • Datasets 3.6.0
  • Tokenizers 0.21.1
