gpt2_u030_tiny-stories_1024_dpos

This model was fine-tuned on the roneneldan/TinyStories dataset; the base model is not named. At the last logged evaluation (step 19000, epoch 0.9941), it reaches a validation loss of 1.2108 and an accuracy of 0.6788.

Model description

More information needed

The training hyperparameters are not listed. The training results below were logged every 1000 steps over roughly one epoch:

Training Loss	Epoch	Step	Validation Loss	Accuracy
2.92	0.0523	1000	2.4566	0.4470
1.9818	0.1046	2000	1.8089	0.5672
1.7321	0.1570	3000	1.6133	0.6021
1.6137	0.2093	4000	1.5171	0.6195
1.5353	0.2616	5000	1.4516	0.6312
1.4845	0.3139	6000	1.4056	0.6400
1.4443	0.3662	7000	1.3718	0.6466
1.4118	0.4186	8000	1.3420	0.6525
1.3878	0.4709	9000	1.3189	0.6569
1.3661	0.5232	10000	1.2988	0.6608
1.3485	0.5755	11000	1.2834	0.6639
1.3326	0.6278	12000	1.2675	0.6669
1.319	0.6802	13000	1.2555	0.6694
1.3068	0.7325	14000	1.2440	0.6719
1.2932	0.7848	15000	1.2350	0.6737
1.2868	0.8371	16000	1.2263	0.6755
1.2791	0.8894	17000	1.2193	0.6771
1.2725	0.9418	18000	1.2141	0.6780
1.2711	0.9941	19000	1.2108	0.6788
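
The validation losses in the table above can be easier to interpret as perplexities. The following is a minimal sketch, assuming the reported loss is the mean per-token cross-entropy in nats (the usual convention for causal language model training, but not confirmed here). The names EVAL_LOG and perplexity are illustrative, and only three rows of the table are copied in.

```python
import math

# (step, validation_loss) pairs copied from the results table above.
# Only a few rows are included to keep the sketch short.
EVAL_LOG = [
    (1000, 2.4566),
    (10000, 1.2988),
    (19000, 1.2108),
]


def perplexity(mean_nll_nats: float) -> float:
    """Convert a mean per-token negative log-likelihood (in nats) to perplexity.

    Assumption: the logged validation loss is a natural-log cross-entropy
    averaged over tokens.
    """
    return math.exp(mean_nll_nats)


for step, loss in EVAL_LOG:
    print(f"step {step:>5}: val loss {loss:.4f} -> perplexity {perplexity(loss):.2f}")
```

Under that assumption, the final validation loss of 1.2108 corresponds to a perplexity of roughly 3.36. Because perplexity depends on the tokenizer, it is only comparable across models that share the same vocabulary.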