gpt2_u100_tiny-stories_1024_dpos

This model is a fine-tuned version of an unspecified base model on the roneneldan/TinyStories dataset. At the last logged evaluation (step 19000) it reaches a validation loss of 1.1708 and an accuracy of 0.6896.

Model description

More information needed

The hyperparameters used during training are not listed. Training and validation metrics were logged every 1000 steps:

Training Loss	Epoch	Step	Validation Loss	Accuracy
2.8305	0.0505	1000	2.3790	0.4650
1.9272	0.1009	2000	1.7501	0.5809
1.6793	0.1514	3000	1.5689	0.6136
1.5605	0.2018	4000	1.4699	0.6316
1.4891	0.2523	5000	1.4112	0.6422
1.4391	0.3028	6000	1.3646	0.6514
1.3995	0.3532	7000	1.3317	0.6575
1.3707	0.4037	8000	1.3021	0.6631
1.3424	0.4541	9000	1.2806	0.6675
1.3242	0.5046	10000	1.2613	0.6714
1.3058	0.5551	11000	1.2435	0.6748
1.2888	0.6055	12000	1.2291	0.6777
1.2748	0.6560	13000	1.2178	0.6801
1.2654	0.7064	14000	1.2077	0.6821
1.2549	0.7569	15000	1.1964	0.6843
1.2459	0.8073	16000	1.1878	0.6860
1.2385	0.8578	17000	1.1819	0.6873
1.2286	0.9083	18000	1.1756	0.6886
1.2246	0.9587	19000	1.1708	0.6896
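
The validation losses above can be read more intuitively as perplexities. The following is a minimal sketch, assuming the reported loss is the mean per-token cross-entropy in nats (the usual convention for causal language model training, but not confirmed by this card). The helper names `perplexity` and `steps_per_epoch` are hypothetical and only for illustration; the numbers are copied from the table.

```python
import math

# (step, epoch, validation_loss) rows copied from the results table above.
ROWS = [
    (1000, 0.0505, 2.3790),
    (10000, 0.5046, 1.2613),
    (19000, 0.9587, 1.1708),
]


def perplexity(loss_nats: float) -> float:
    """Convert a mean cross-entropy loss to perplexity.

    Assumption: the loss is measured in nats per token.
    """
    return math.exp(loss_nats)


def steps_per_epoch(step: int, epoch_fraction: float) -> float:
    """Estimate optimizer steps in one full epoch from a logged row."""
    return step / epoch_fraction


for step, epoch, loss in ROWS:
    print(
        f"step {step:>5} (epoch {epoch:.4f}): "
        f"val loss {loss:.4f} -> perplexity {perplexity(loss):.2f}"
    )

# The epoch column is rounded, so this is only an approximate figure.
last_step, last_epoch, _ = ROWS[-1]
print(f"approx. steps per epoch: {steps_per_epoch(last_step, last_epoch):.0f}")
```

Under that assumption, the final validation loss of 1.1708 corresponds to a perplexity of roughly 3.2, and the step and epoch columns imply a little under 20,000 steps per epoch.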