Whisper Large Ro - VM2

This model is a fine-tuned version of openai/whisper-large; the training dataset is not specified in this card. It achieves the following results on the evaluation set (a sketch of the WER computation follows the list):

  • Loss: 0.1650
  • WER: 6.0193
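
The card does not describe the evaluation pipeline. The following is a minimal sketch of how a WER figure like the one above is typically computed, using the Hugging Face `evaluate` library; the example strings are hypothetical, and the exact text normalization behind the reported 6.0193 is not stated.

```python
# Minimal WER sketch with the `evaluate` library; the example strings are hypothetical
# and the text normalization used for the reported score is not specified in the card.
import evaluate

wer_metric = evaluate.load("wer")

predictions = ["astăzi este o zi frumoasă"]               # model transcription (hypothetical)
references = ["astăzi este o zi frumoasă în București"]   # ground-truth transcript (hypothetical)

wer = 100 * wer_metric.compute(predictions=predictions, references=references)
print(f"WER: {wer:.4f}")  # reported on a percentage scale, as in the results table below
```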

Model description

More information needed

Intended uses & limitations

More information needed
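
No usage guidance is provided. As a hedged sketch, the checkpoint should load like any other fine-tuned Whisper model through the transformers ASR pipeline; the repo id is taken from the model page, while the audio path, device, and Romanian language setting are assumptions.

```python
# Hedged usage sketch: transcribe a Romanian audio file with the fine-tuned checkpoint.
# "audio.wav" is a placeholder; chunk_length_s enables long-form transcription.
import torch
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="VMadalina/whisper-large-news-augmented2",
    torch_dtype=torch.float16,
    device="cuda:0",  # use "cpu" if no GPU is available
    chunk_length_s=30,
)

result = asr("audio.wav", generate_kwargs={"language": "romanian", "task": "transcribe"})
print(result["text"])
```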

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged Seq2SeqTrainingArguments sketch follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 64
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 3000
  • training_steps: 20000
  • mixed_precision_training: Native AMP
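
A hedged sketch of how the values above map onto transformers `Seq2SeqTrainingArguments`; the output directory, evaluation cadence, and best-model metric are illustrative assumptions, not taken from the card.

```python
# Sketch of Seq2SeqTrainingArguments mirroring the hyperparameters listed above.
# output_dir, eval cadence, and metric_for_best_model are illustrative assumptions.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-large-ro-vm2",  # hypothetical
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    gradient_accumulation_steps=2,        # effective train batch size of 64
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=3000,
    max_steps=20000,
    fp16=True,                            # "Native AMP" mixed-precision training
    eval_strategy="steps",                # assumption: evaluate every 1000 steps
    eval_steps=1000,
    predict_with_generate=True,
    metric_for_best_model="wer",
)
```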

Training results

| Training Loss | Epoch   | Step  | Validation Loss | WER     |
|:-------------:|:-------:|:-----:|:---------------:|:-------:|
| 0.0978        | 1.2063  | 1000  | 0.1440          | 69.1424 |
| 0.0593        | 2.4125  | 2000  | 0.1349          | 19.0102 |
| 0.0479        | 3.6188  | 3000  | 0.1423          | 12.2152 |
| 0.0313        | 4.8251  | 4000  | 0.1407          | 8.6577  |
| 0.0115        | 6.0314  | 5000  | 0.1427          | 7.9429  |
| 0.0094        | 7.2376  | 6000  | 0.1444          | 7.4316  |
| 0.0063        | 8.4439  | 7000  | 0.1483          | 7.2524  |
| 0.0068        | 9.6502  | 8000  | 0.1517          | 7.2168  |
| 0.0047        | 10.8565 | 9000  | 0.1533          | 7.1072  |
| 0.0028        | 12.0627 | 10000 | 0.1560          | 6.6211  |
| 0.0027        | 13.2690 | 11000 | 0.1539          | 6.6994  |
| 0.0014        | 14.4753 | 12000 | 0.1528          | 6.5063  |
| 0.0012        | 15.6815 | 13000 | 0.1571          | 6.4202  |
| 0.0008        | 16.8878 | 14000 | 0.1592          | 6.4315  |
| 0.0002        | 18.0941 | 15000 | 0.1577          | 6.4602  |
| 0.0002        | 19.3004 | 16000 | 0.1588          | 6.1907  |
| 0.0001        | 20.5066 | 17000 | 0.1607          | 6.0776  |
| 0.0           | 21.7129 | 18000 | 0.1621          | 6.0402  |
| 0.0           | 22.9192 | 19000 | 0.1643          | 6.0611  |
| 0.0           | 24.1255 | 20000 | 0.1650          | 6.0193  |

Framework versions

  • Transformers 4.50.1
  • Pytorch 2.6.0+cu124
  • Datasets 3.4.1
  • Tokenizers 0.21.1
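
A quick way to confirm a local environment matches these versions (a simple Python check; the expected strings come from the list above):

```python
# Check installed framework versions against the ones listed above.
import transformers, torch, datasets, tokenizers

print(transformers.__version__)  # expected: 4.50.1
print(torch.__version__)         # expected: 2.6.0+cu124
print(datasets.__version__)      # expected: 3.4.1
print(tokenizers.__version__)    # expected: 0.21.1
```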