---
license: mit
tags:
- generated_from_keras_callback
model-index:
- name: MariaZafar/gpt2-finetuned-wikitext2
  results: []
---

# MariaZafar/gpt2-finetuned-wikitext2

This model is a fine-tuned version of [gpt2](https://huggingface.co/gpt2) on an unknown dataset.
It achieves the following results on the evaluation set:
- Train Loss: 0.7785
- Validation Loss: 3.7004
- Epoch: 49

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- optimizer: {'name': 'AdamWeightDecay', 'learning_rate': 2e-05, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False, 'weight_decay_rate': 0.01}
- training_precision: float32

### Training results

| Train Loss | Validation Loss | Epoch |
|:----------:|:---------------:|:-----:|
| 5.8858     | 7.5655          | 0     |
| 4.0619     | 5.8193          | 1     |
| 3.3766     | 4.9585          | 2     |
| 3.0686     | 4.5764          | 3     |
| 2.9022     | 4.3847          | 4     |
| 2.7838     | 4.2249          | 5     |
| 2.6997     | 4.1060          | 6     |
| 2.6154     | 4.0100          | 7     |
| 2.5575     | 3.9412          | 8     |
| 2.4933     | 3.8447          | 9     |
| 2.4397     | 3.7619          | 10    |
| 2.3835     | 3.7510          | 11    |
| 2.3403     | 3.6810          | 12    |
| 2.2924     | 3.6716          | 13    |
| 2.2513     | 3.6335          | 14    |
| 2.2031     | 3.6208          | 15    |
| 2.1619     | 3.5915          | 16    |
| 2.1234     | 3.5497          | 17    |
| 2.0792     | 3.5540          | 18    |
| 2.0398     | 3.5461          | 19    |
| 1.9976     | 3.5282          | 20    |
| 1.9577     | 3.5260          | 21    |
| 1.9176     | 3.5041          | 22    |
| 1.8745     | 3.4994          | 23    |
| 1.8304     | 3.5250          | 24    |
| 1.7881     | 3.4864          | 25    |
| 1.7423     | 3.4718          | 26    |
| 1.6993     | 3.5194          | 27    |
| 1.6503     | 3.5019          | 28    |
| 1.6025     | 3.5055          | 29    |
| 1.5500     | 3.5109          | 30    |
| 1.4964     | 3.5389          | 31    |
| 1.4448     | 3.5393          | 32    |
| 1.3954     | 3.5363          | 33    |
| 1.3464     | 3.5446          | 34    |
| 1.2978     | 3.5117          | 35    |
| 1.2494     | 3.5225          | 36    |
| 1.2004     | 3.5443          | 37    |
| 1.1534     | 3.5909          | 38    |
| 1.1124     | 3.5380          | 39    |
| 1.0709     | 3.6162          | 40    |
| 1.0265     | 3.6758          | 41    |
| 0.9936     | 3.6168          | 42    |
| 0.9590     | 3.6243          | 43    |
| 0.9238     | 3.6308          | 44    |
| 0.8886     | 3.6429          | 45    |
| 0.8635     | 3.7137          | 46    |
| 0.8352     | 3.6512          | 47    |
| 0.8050     | 3.7033          | 48    |
| 0.7785     | 3.7004          | 49    |

### Framework versions

- Transformers 4.19.2
- TensorFlow 2.8.0
- Datasets 2.2.1
- Tokenizers 0.12.1
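
### Reading the loss values

The training results above report raw loss values only. As a rough aid for interpreting them, the sketch below converts a loss to perplexity and picks the row with the lowest validation loss. It assumes the reported losses are mean per-token cross-entropy in nats, which this card does not state explicitly. The names `ROWS`, `perplexity`, and `best_epoch` are hypothetical helpers written for this illustration and are not part of the training code.

```python
import math

# A subset of (epoch, train_loss, validation_loss) rows copied from the
# training results table in this card.
ROWS = [
    (0, 5.8858, 7.5655),
    (10, 2.4397, 3.7619),
    (20, 1.9976, 3.5282),
    (26, 1.7423, 3.4718),
    (30, 1.5500, 3.5109),
    (40, 1.0709, 3.6162),
    (49, 0.7785, 3.7004),
]


def perplexity(loss: float) -> float:
    """Convert a loss to perplexity.

    Assumption: the loss is a mean per-token cross-entropy in nats.
    """
    return math.exp(loss)


def best_epoch(rows):
    """Return the (epoch, train_loss, val_loss) row with the lowest validation loss."""
    return min(rows, key=lambda row: row[2])


if __name__ == "__main__":
    epoch, train_loss, val_loss = best_epoch(ROWS)
    print(f"lowest validation loss in subset: epoch {epoch}, loss {val_loss}")
    print(f"validation perplexity at that epoch: {perplexity(val_loss):.2f}")
    print(f"validation perplexity at epoch 49: {perplexity(ROWS[-1][2]):.2f}")
```

In the full table, validation loss is lowest at epoch 26 (3.4718) and rises to 3.7004 by epoch 49, while the train loss keeps falling to 0.7785. Under the assumption above, that corresponds to a validation perplexity of roughly 32 at epoch 26 versus roughly 40 at the final epoch.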