raghavbali/gpt2-instruct-tuned-translator2

Text Generation

Generated from Trainer

text-generation-inference


raghavbali committed on Sep 14, 2024

Commit 1914103 · verified · 1 Parent(s): b30b9bb

Update Model card

Files changed (1)

README.md +12 -7


@@ -12,21 +12,26 @@ model-index:
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
-# gpt2-instruct-tuned-translator2
-This model is a fine-tuned version of [raghavbali/gpt2-finetuned-headliner](https://huggingface.co/raghavbali/gpt2-finetuned-headliner) on the None dataset.
 ## Model description
-More information needed
 ## Intended uses & limitations
-More information needed
-## Training and evaluation data
-More information needed
 ## Training procedure

 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
+# GPT2 Instruction Tuned English To German Headline Translation Model
+- This model is instruction tuned on an English-to-German news headline translation dataset derived from [Harvard/abc-news-dataset](https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/SYBGZL).
+- The German translations in the dataset were generated with LLaMA3.1 and GPT4o models.
+- This model is a fine-tuned version of [raghavbali/gpt2-finetuned-headliner](https://huggingface.co/raghavbali/gpt2-finetuned-headliner).
 ## Model description
+This model leverages a Stanford Alpaca style instruction tuning dataset. The format is as follows:
+```md
+###Translate English Text to German:{text} ###Output: {translated_text}
+```
+The format is slightly modified to reduce the number of additional tokens required for the instructions, because GPT2's context size is very limited.
+The model is trained on a small sample of ~5k examples to showcase the impact of instruction tuning on the model's overall alignment with the requested task.
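As an illustration of the prompt format described above, here is a minimal Python sketch of how a prompt and a training example could be assembled, and how the translation could be pulled out of generated text. Only the template string comes from the model card; the helper names (`build_prompt`, `build_training_example`, `extract_translation`) are hypothetical, and ending the inference prompt at `###Output:` without a trailing space is an assumption made here, not something the card specifies.

```python
# Sketch only: the template text is taken from the model card; every
# function name below is a hypothetical helper, not part of the model repo.
INSTRUCTION = "###Translate English Text to German:"
OUTPUT_MARKER = "###Output:"


def build_prompt(text: str) -> str:
    """Build an inference prompt that stops right at the output marker."""
    return f"{INSTRUCTION}{text.strip()} {OUTPUT_MARKER}"


def build_training_example(text: str, translated_text: str) -> str:
    """Build a full training string: prompt, one space, then the translation."""
    return f"{build_prompt(text)} {translated_text.strip()}"


def extract_translation(generated: str) -> str:
    """Return the text after the first output marker.

    Anything after a further '###' is dropped, since the model may keep
    generating new instruction blocks. Returns '' if the marker is absent.
    """
    _, _, tail = generated.partition(OUTPUT_MARKER)
    return tail.split("###", 1)[0].strip()


if __name__ == "__main__":
    print(build_prompt("Heavy rain floods coastal towns"))
    sample = build_training_example("Heavy rain floods coastal towns", "<German text>")
    print(extract_translation(sample))
```

The string returned by `build_prompt` would be passed to whatever generation call is used, and `extract_translation` applied to the decoded output. Given the model's noted tendency to continue headlines, truncating at the next `###` is a simple heuristic rather than a guarantee of clean output.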
 ## Intended uses & limitations
+This model is for learning purposes only. It seems to have picked up German vocabulary and sentence structure to a good extent, but the actual translations are at times grossly incorrect.
+The model also attempts to complete the news headlines given as prompts and has a high tendency to hallucinate.
 ## Training procedure