de_mlm_child_13_new

This model is a fine-tuned version of an unnamed base model on an unknown dataset. The card does not list final evaluation metrics; the last validation loss logged in the table below is 1.9684 (step 100000).

Model description

More information needed

The hyperparameters used during training are not listed in this card.

Training results

Training Loss	Epoch	Step	Validation Loss
No log	1.2096	2000	7.0534
7.005	2.4191	4000	5.9491
7.005	3.6287	6000	5.8401
5.5533	4.8382	8000	5.7121
5.5533	6.0478	10000	5.6198
5.3687	7.2573	12000	5.5387
5.3687	8.4669	14000	5.4706
5.2269	9.6764	16000	5.4217
5.2269	10.8860	18000	5.3631
5.1229	12.0956	20000	5.3213
5.1229	13.3051	22000	5.2743
5.0061	14.5147	24000	4.9732
5.0061	15.7242	26000	4.2903
4.0671	16.9338	28000	3.7955
4.0671	18.1433	30000	3.5115
3.3253	19.3529	32000	3.3390
3.3253	20.5624	34000	3.1444
2.969	21.7720	36000	2.9896
2.969	22.9816	38000	2.8436
2.6828	24.1911	40000	2.7259
2.6828	25.4007	42000	2.6287
2.4634	26.6102	44000	2.5300
2.4634	27.8198	46000	2.4772
2.317	29.0293	48000	2.4356
2.317	30.2389	50000	2.3599
2.2136	31.4484	52000	2.3320
2.2136	32.6580	54000	2.3017
2.1362	33.8676	56000	2.2597
2.1362	35.0771	58000	2.2461
2.0753	36.2867	60000	2.2101
2.0753	37.4962	62000	2.1714
2.0214	38.7058	64000	2.1669
2.0214	39.9153	66000	2.1505
1.9757	41.1249	68000	2.1361
1.9757	42.3344	70000	2.0909
1.9433	43.5440	72000	2.0859
1.9433	44.7536	74000	2.0732
1.9082	45.9631	76000	2.0745
1.9082	47.1727	78000	2.0481
1.8821	48.3822	80000	2.0327
1.8821	49.5918	82000	2.0211
1.8581	50.8013	84000	2.0263
1.8581	52.0109	86000	2.0064
1.8359	53.2204	88000	2.0034
1.8359	54.4300	90000	1.9978
1.8159	55.6396	92000	1.9832
1.8159	56.8491	94000	1.9772
1.8051	58.0587	96000	1.9930
1.8051	59.2682	98000	1.9752
1.794	60.4778	100000	1.9684
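
To make the logged numbers easier to interpret, the sketch below converts a few validation losses from the table into exp(loss) and estimates the number of optimizer steps per epoch from the Step and Epoch columns. It assumes the validation loss is a mean cross-entropy in natural-log units over the predicted tokens, which this card does not state; the helper names `perplexity` and `steps_per_epoch` are illustrative only and are not part of any library.

```python
import math

# A few (step, epoch, validation_loss) rows copied from the table above.
rows = [
    (2000, 1.2096, 7.0534),
    (26000, 15.7242, 4.2903),
    (50000, 30.2389, 2.3599),
    (100000, 60.4778, 1.9684),
]


def perplexity(loss: float) -> float:
    """exp(loss); a perplexity only if the loss is mean cross-entropy in nats (assumption)."""
    return math.exp(loss)


def steps_per_epoch(step: int, epoch: float) -> float:
    """Estimate optimizer steps per epoch from one logged (step, epoch) pair."""
    return step / epoch


for step, epoch, loss in rows:
    print(
        f"step {step:>6}  epoch {epoch:7.4f}  "
        f"val loss {loss:.4f}  exp(loss) {perplexity(loss):9.2f}"
    )

# Use the last row for the steps-per-epoch estimate.
last_step, last_epoch, _ = rows[-1]
print(f"approx. steps per epoch: {steps_per_epoch(last_step, last_epoch):.1f}")
```

For a masked language model this quantity is better read as a rough per-masked-token figure than as a conventional language-model perplexity, and it is only comparable across runs that share the same tokenizer and masking setup.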