# gpt2_m030_tiny-stories_1024_dpos
This model was fine-tuned on the roneneldan/TinyStories dataset. It achieves the following results on the evaluation set:
- Loss: 1.2151
- Accuracy: 0.6770
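
A minimal inference sketch (assuming the model is published on the Hugging Face Hub; the repo id below is a placeholder, not taken from this card):

```python
# Minimal inference sketch; the repo id is a placeholder, not confirmed by this card.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "your-username/gpt2_m030_tiny-stories_1024_dpos"  # placeholder repo id
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

prompt = "Once upon a time"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```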
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 5e-05
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 1.0
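
For reference, a hedged sketch of how these settings map onto the Transformers `TrainingArguments` API; anything not listed above (the output directory, data handling, model initialization) is an assumption, not taken from this card:

```python
# Sketch only: reproduces the listed hyperparameters with the Transformers Trainer API.
# Names not listed in the card (e.g. output_dir) are assumptions.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="gpt2_m030_tiny-stories_1024_dpos",  # assumed output path
    learning_rate=5e-05,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    num_train_epochs=1.0,
)
```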
### Training results
| Training Loss | Epoch  | Step  | Validation Loss | Accuracy |
|:-------------:|:------:|:-----:|:---------------:|:--------:|
| 2.9188        | 0.0525 | 1000  | 2.4482          | 0.4471   |
| 1.9766        | 0.1050 | 2000  | 1.7972          | 0.5678   |
| 1.7303        | 0.1575 | 3000  | 1.6148          | 0.6000   |
| 1.609         | 0.2101 | 4000  | 1.5150          | 0.6186   |
| 1.5368        | 0.2626 | 5000  | 1.4535          | 0.6300   |
| 1.4863        | 0.3151 | 6000  | 1.4084          | 0.6384   |
| 1.4466        | 0.3676 | 7000  | 1.3733          | 0.6450   |
| 1.4134        | 0.4201 | 8000  | 1.3437          | 0.6510   |
| 1.3904        | 0.4726 | 9000  | 1.3217          | 0.6553   |
| 1.3643        | 0.5252 | 10000 | 1.3030          | 0.6589   |
| 1.3492        | 0.5777 | 11000 | 1.2846          | 0.6625   |
| 1.337         | 0.6302 | 12000 | 1.2720          | 0.6651   |
| 1.3219        | 0.6827 | 13000 | 1.2582          | 0.6679   |
| 1.3102        | 0.7352 | 14000 | 1.2470          | 0.6702   |
| 1.2984        | 0.7877 | 15000 | 1.2394          | 0.6717   |
| 1.29          | 0.8402 | 16000 | 1.2299          | 0.6738   |
| 1.2789        | 0.8928 | 17000 | 1.2236          | 0.6750   |
| 1.2752        | 0.9453 | 18000 | 1.2182          | 0.6762   |
| 1.275         | 0.9978 | 19000 | 1.2151          | 0.6770   |
### Framework versions
- Transformers 4.42.3
- Pytorch 2.2.2+cu121
- Datasets 2.20.0
- Tokenizers 0.19.1