raghavbali committed on
Commit 1914103 · verified · 1 Parent(s): b30b9bb

Update Model card

Files changed (1)
  1. README.md +12 -7
README.md CHANGED
@@ -12,21 +12,26 @@ model-index:
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

- # gpt2-instruct-tuned-translator2
+ # GPT2 Instruction Tuned English To German Headline Translation Model

- This model is a fine-tuned version of [raghavbali/gpt2-finetuned-headliner](https://huggingface.co/raghavbali/gpt2-finetuned-headliner) on the None dataset.
+ - This model makes use of an English-to-German news headline translation dataset derived from [Harvard/abc-news-dataset](https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/SYBGZL) for the task of instruction tuning
+ - The dataset was derived using the LLaMA 3.1 and GPT-4o models to generate the translations; a sketch of one derived record follows below
+ - This model is a fine-tuned version of [raghavbali/gpt2-finetuned-headliner](https://huggingface.co/raghavbali/gpt2-finetuned-headliner).
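+ 
+ A hypothetical sketch of how one derived record could be assembled is shown below; the `build_record` helper and the sample headline/translation pair are illustrative, not taken from the actual dataset:
+ ```python
+ # Illustrative only: build_record and the sample pair are hypothetical.
+ def build_record(text: str, translated_text: str) -> str:
+     # Compact Alpaca-style template described under "Model description" below.
+     return f"###Translate English Text to German:{text} ###Output: {translated_text}"
+ 
+ record = build_record(
+     "Scientists discover new species of fish",  # English headline (illustrative)
+     "Wissenschaftler entdecken neue Fischart",  # LLM-generated German (illustrative)
+ )
+ print(record)
+ ```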

## Model description

- More information needed
+ This model leverages a Stanford Alpaca style instruction tuning dataset; the format is as follows:
+ ```md
+ ###Translate English Text to German:{text} ###Output: {translated_text}
+ ```
+ The format is slightly modified to reduce the additional tokens required for the instructions, as the GPT2 context size is very limited.
+ The model is trained on a small sample of ~5k examples to showcase the impact of instruction tuning on the overall alignment of the model with the requested task.
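+ 
+ A minimal inference sketch is shown below; the repo id is assumed from this card's original auto-generated title and the headline is illustrative, so adjust both as needed:
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ 
+ model_id = "raghavbali/gpt2-instruct-tuned-translator2"  # assumed repo id
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ model = AutoModelForCausalLM.from_pretrained(model_id)
+ 
+ # Build the prompt in the compact Alpaca-style format described above.
+ headline = "Scientists discover new species of fish"
+ prompt = f"###Translate English Text to German:{headline} ###Output:"
+ 
+ inputs = tokenizer(prompt, return_tensors="pt")
+ outputs = model.generate(
+     **inputs,
+     max_new_tokens=40,
+     do_sample=False,
+     pad_token_id=tokenizer.eos_token_id,
+ )
+ # Decode only the newly generated tokens, i.e. the translation.
+ print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
+ ```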

## Intended uses & limitations

- More information needed
+ This model is intended for learning purposes only. It seems to have picked up German vocabulary as well as sentence structure to a good extent, but the actual translations are at times grossly incorrect.
+ The model also attempts to complete the news headlines given as prompts and has a high tendency to hallucinate.

- ## Training and evaluation data
-
- More information needed

## Training procedure
