speecht5_finetuned_english_ranil_aug2

This model is a fine-tuned version of microsoft/speecht5_tts on an unspecified dataset (the dataset name was not recorded). It achieves the following results on the evaluation set:

Loss: 0.5833

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 4e-05
train_batch_size: 1
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 8
total_train_batch_size: 8
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 20
num_epochs: 40
mixed_precision_training: Native AMP
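
The sketch below shows how two of the values above relate to each other: the total train batch size as the product of the per-device batch size and the gradient accumulation steps, and the learning rate produced by a linear schedule with 20 warmup steps. It is a plain-Python illustration, not the trainer's own code. NUM_DEVICES = 1 and TOTAL_STEPS = 1880 (the last step logged in the results table) are assumptions, and the function names are hypothetical.

```python
# Values copied from the hyperparameter list above.
LEARNING_RATE = 4e-05
TRAIN_BATCH_SIZE = 1
GRADIENT_ACCUMULATION_STEPS = 8
WARMUP_STEPS = 20

NUM_DEVICES = 1     # assumption: single-device training
TOTAL_STEPS = 1880  # assumption: the last step logged in the results table


def effective_batch_size(per_device: int, accumulation: int, devices: int) -> int:
    """Number of examples that contribute to one optimizer update."""
    return per_device * accumulation * devices


def linear_lr(step: int,
              base_lr: float = LEARNING_RATE,
              warmup: int = WARMUP_STEPS,
              total: int = TOTAL_STEPS) -> float:
    """Linear warmup from 0 to base_lr, then linear decay to 0 at `total`."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


if __name__ == "__main__":
    print("effective batch size:",
          effective_batch_size(TRAIN_BATCH_SIZE,
                               GRADIENT_ACCUMULATION_STEPS,
                               NUM_DEVICES))
    for step in (0, 10, 20, 950, 1880):
        print(step, linear_lr(step))
```

Under these assumptions the learning rate peaks at 4e-05 once warmup ends at step 20 and has decayed to half that value by step 950.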

Training Loss	Epoch	Step	Validation Loss
0.5568	1.0	48	0.6822
0.4527	2.0	96	0.6500
0.4343	3.0	144	0.6412
0.4038	4.0	192	0.6339
0.4056	5.0	240	0.6388
0.3966	6.0	288	0.6324
0.3889	7.0	336	0.6302
0.3853	8.0	384	0.6484
0.3744	9.0	432	0.6202
0.3699	10.0	480	0.6162
0.3716	11.0	528	0.6161
0.365	12.0	576	0.6149
0.3631	13.0	624	0.6110
0.3597	14.0	672	0.6109
0.3597	15.0	720	0.6112
0.3547	16.0	768	0.6050
0.353	17.0	816	0.6034
0.348	18.0	864	0.6015
0.3449	19.0	912	0.5975
0.3432	20.0	960	0.5983
0.3436	21.0	1008	0.6019
0.3409	22.0	1056	0.6016
0.3379	23.0	1104	0.5985
0.3357	24.0	1152	0.5970
0.3316	25.0	1200	0.5948
0.3338	26.0	1248	0.5991
0.3336	27.0	1296	0.5936
0.3317	28.0	1344	0.5867
0.3293	29.0	1392	0.5885
0.3288	30.0	1440	0.5884
0.3289	31.0	1488	0.5892
0.3242	32.0	1536	0.5892
0.3253	33.0	1584	0.5860
0.3261	34.0	1632	0.5860
0.3253	35.0	1680	0.5857
0.3229	36.0	1728	0.5863
0.3226	37.0	1776	0.5858
0.3219	38.0	1824	0.5899
0.3186	39.0	1872	0.5855
0.3268	39.1684	1880	0.5833
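
To read the table above programmatically, a minimal sketch such as the following could parse the rows, pick the row with the lowest validation loss, and report the gap between validation and training loss. The rows are an excerpt copied from the table, so the "best" row is the best within that excerpt. The names LogRow, parse_rows, best_row, and loss_gap are hypothetical helpers introduced for illustration.

```python
from typing import List, NamedTuple


class LogRow(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float


# Excerpt of rows copied from the results table (whitespace-separated).
EXCERPT = """
0.5568 1.0     48   0.6822
0.3699 10.0    480  0.6162
0.3432 20.0    960  0.5983
0.3317 28.0    1344 0.5867
0.3253 35.0    1680 0.5857
0.3186 39.0    1872 0.5855
0.3268 39.1684 1880 0.5833
"""


def parse_rows(text: str) -> List[LogRow]:
    """Parse 'train_loss epoch step val_loss' lines into LogRow tuples."""
    rows = []
    for line in text.strip().splitlines():
        train_loss, epoch, step, val_loss = line.split()
        rows.append(LogRow(float(train_loss), float(epoch),
                           int(step), float(val_loss)))
    return rows


def best_row(rows: List[LogRow]) -> LogRow:
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda r: r.val_loss)


def loss_gap(row: LogRow) -> float:
    """Validation loss minus training loss for one row."""
    return row.val_loss - row.train_loss


if __name__ == "__main__":
    rows = parse_rows(EXCERPT)
    best = best_row(rows)
    print("lowest validation loss:", best.val_loss, "at step", best.step)
    print("gap at first row:", round(loss_gap(rows[0]), 4))
    print("gap at last row:", round(loss_gap(rows[-1]), 4))
```

In this excerpt the validation loss is lowest at the final logged step, while the gap between validation and training loss roughly doubles between the first and last rows, which may be worth keeping in mind when judging how much further training would help.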