TED_CLM_gpt2_tedlium_bigger_lr

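This model is a fine-tuned language model; the base model and the training dataset are not recorded in this card. At the last logged evaluation (step 72000), it reaches a validation loss of 1.8755 and an accuracy of 0.5540 on the evaluation set.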

Model description

More information needed

The hyperparameters used during training are not listed. Training and validation metrics were logged every 3000 steps:

Training Loss	Epoch	Step	Validation Loss	Accuracy
2.0351	0.62	3000	2.2280	0.4798
1.9186	1.24	6000	2.0994	0.5074
1.8800	1.86	9000	2.0577	0.5142
1.8505	2.49	12000	2.0113	0.5223
1.8284	3.11	15000	1.9957	0.5279
1.8182	3.73	18000	1.9891	0.5305
1.8061	4.35	21000	1.9617	0.5371
1.7969	4.97	24000	1.9413	0.5369
2.0383	5.59	27000	2.1697	0.4894
1.7668	6.22	30000	1.9366	0.5397
1.7556	6.84	33000	1.9303	0.5402
1.7492	7.46	36000	1.9140	0.5432
1.7409	8.08	39000	1.9088	0.5445
1.7317	8.70	42000	1.9030	0.5455
1.7218	9.32	45000	1.9040	0.5496
1.7261	9.94	48000	1.8952	0.5506
1.7175	10.57	51000	1.8959	0.5498
1.7080	11.19	54000	1.8909	0.5510
1.7056	11.81	57000	1.8917	0.5518
1.6971	12.43	60000	1.8879	0.5523
1.6986	13.05	63000	1.8790	0.5532
1.6972	13.67	66000	1.8799	0.5526
1.6858	14.29	69000	1.8782	0.5543
1.6875	14.92	72000	1.8755	0.5540
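
As a rough aid to reading the table: if the reported losses are mean cross-entropy in nats per token (an assumption; this card does not state how the loss is defined), a validation loss converts to perplexity as exp(loss). The sketch below applies that to the first and last rows and estimates the number of optimizer steps per epoch from the Epoch and Step columns. The helper names are illustrative and not part of any training script.

```python
import math

# (epoch, step, validation_loss) copied from the first and last table rows.
FIRST_ROW = (0.62, 3000, 2.2280)
LAST_ROW = (14.92, 72000, 1.8755)


def perplexity(loss: float) -> float:
    """Convert a mean cross-entropy loss to perplexity.

    Assumes the loss is measured in nats per token.
    """
    return math.exp(loss)


def steps_per_epoch(epoch: float, step: int) -> float:
    """Estimate steps per epoch from one logged (epoch, step) pair.

    The Epoch column is rounded to two decimals, so this is approximate.
    """
    return step / epoch


first_ppl = perplexity(FIRST_ROW[2])
final_ppl = perplexity(LAST_ROW[2])
est_steps = steps_per_epoch(LAST_ROW[0], LAST_ROW[1])

print(f"validation perplexity at step {FIRST_ROW[1]}: {first_ppl:.2f}")
print(f"validation perplexity at step {LAST_ROW[1]}: {final_ppl:.2f}")
print(f"estimated steps per epoch: {est_steps:.0f}")
```

Under that assumption, the drop in validation loss over training corresponds to a clear drop in perplexity. The estimate of steps per epoch is only as precise as the rounded Epoch column allows.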