gpt2_m000_tiny-stories_1024_dpos

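This model is a fine-tuned version of an unspecified base model on the roneneldan/TinyStories dataset. At the last logged step (19000) it reaches a validation loss of 1.2101 and an accuracy of 0.6786 on the evaluation set; the full history is in the table below.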

Model description

More information needed

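The hyperparameters used during training are not listed here. Training and validation metrics, logged every 1000 steps across roughly one epoch, were as follows: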

Training Loss	Epoch	Step	Validation Loss	Accuracy
2.9158	0.0524	1000	2.4497	0.4472
1.9767	0.1048	2000	1.7961	0.5687
1.7276	0.1572	3000	1.6160	0.6008
1.6067	0.2095	4000	1.5124	0.6198
1.5333	0.2619	5000	1.4490	0.6316
1.4851	0.3143	6000	1.4045	0.6401
1.4409	0.3667	7000	1.3679	0.6469
1.4136	0.4191	8000	1.3405	0.6521
1.3862	0.4715	9000	1.3192	0.6562
1.3654	0.5238	10000	1.3000	0.6600
1.3468	0.5762	11000	1.2802	0.6640
1.3298	0.6286	12000	1.2670	0.6667
1.3187	0.6810	13000	1.2545	0.6692
1.3017	0.7334	14000	1.2441	0.6714
1.2955	0.7858	15000	1.2334	0.6736
1.2826	0.8381	16000	1.2252	0.6753
1.2773	0.8905	17000	1.2187	0.6766
1.2729	0.9429	18000	1.2132	0.6779
1.2672	0.9953	19000	1.2101	0.6786
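
The validation losses above can be easier to interpret as perplexities. The sketch below converts a few rows of the table, assuming the reported loss is the mean per-token cross-entropy in nats (the usual convention for causal language model training, but not stated in this card). The `perplexity` helper and the `checkpoints` list are illustrative names, not part of the training code.

```python
import math


def perplexity(mean_cross_entropy_nats: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    # Assumption: the loss is a natural-log cross-entropy averaged over tokens.
    return math.exp(mean_cross_entropy_nats)


# (step, validation loss) pairs copied from the table above.
checkpoints = [(1000, 2.4497), (10000, 1.3000), (19000, 1.2101)]

for step, loss in checkpoints:
    print(f"step {step}: validation loss {loss:.4f} -> perplexity {perplexity(loss):.2f}")
```

Under that assumption, the last logged validation loss of 1.2101 corresponds to a perplexity of roughly 3.35.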