Update Model card
README.md
CHANGED
@@ -12,21 +12,26 @@ model-index:
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->
-
#
-
This model
## Model description
-
## Intended uses & limitations
-
-
## Training and evaluation data
-
-
More information needed
## Training procedure
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->
+
# GPT2 Instruction-Tuned English-to-German Headline Translation Model
+
- This model is instruction-tuned on an English-to-German news headline translation dataset derived from [Harvard/abc-news-dataset](https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/SYBGZL).
+
- The German translations in the dataset were generated with the LLaMA 3.1 and GPT-4o models.
+
- This model is a fine-tuned version of [raghavbali/gpt2-finetuned-headliner](https://huggingface.co/raghavbali/gpt2-finetuned-headliner).
## Model description
+
This model is trained on a Stanford Alpaca-style instruction-tuning dataset. Each example has the following format:
+
```md
+
###Translate English Text to German:{text} ###Output: {translated_text}
+
```
+
The format is slightly shortened relative to the original Alpaca template to reduce the number of tokens spent on instructions, since GPT2's context size is very limited (1,024 tokens).
+
The model is trained on a small sample of ~5k examples to showcase the impact of instruction tuning on the model's overall alignment with the requested task.
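As a rough illustration of the format described in this section, the template could be filled in as shown here. This is a minimal sketch, not the author's training code: the helper names `build_prompt` and `build_training_example` are hypothetical, and the sample headline and translation are made up for illustration.

```python
# Template copied from the model card; spacing is kept exactly as shown there.
PROMPT_TEMPLATE = "###Translate English Text to German:{text} ###Output: "


def build_prompt(text: str) -> str:
    """Inference-time prompt: ends right after the output marker (hypothetical helper)."""
    return PROMPT_TEMPLATE.format(text=text)


def build_training_example(text: str, translated_text: str) -> str:
    """Training-time string: the prompt followed by the target translation."""
    return build_prompt(text) + translated_text


headline = "Rain expected this weekend"       # made-up sample, not from the dataset
translation = "Regen am Wochenende erwartet"  # made-up sample, not from the dataset
print(build_training_example(headline, translation))
```

Keeping the prompt builder separate from the training-example builder means the same prefix can be reused at inference time, where the model is expected to fill in the text after `###Output:`.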
## Intended uses & limitations
+
This model is intended for learning purposes only. It seems to have picked up German vocabulary and sentence structure to a good extent, but the actual translations are at times grossly incorrect.
+
The model also attempts to complete the news headlines given in the prompt and has a high tendency to hallucinate.
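Because the model tends to keep writing past the translation, it may help to trim the raw generation before using it. The sketch below assumes that the generated string echoes the prompt and that any continuation starts with another `###` marker or a new line; both are assumptions about the output, not documented behavior, and `extract_translation` is a hypothetical helper. Trimming only removes trailing continuations; it does not make an incorrect translation correct.

```python
OUTPUT_MARKER = "###Output:"


def extract_translation(generated: str) -> str:
    """Return the text after the first output marker, cut at the next '###' or newline."""
    # Drop the echoed prompt, if the marker is present at all.
    _, found, tail = generated.partition(OUTPUT_MARKER)
    if not found:
        tail = generated
    # Cut at whichever assumed stop sequence appears first.
    for stop in ("###", "\n"):
        tail = tail.split(stop, 1)[0]
    return tail.strip()


# Made-up raw output illustrating a continuation after the translation.
raw = (
    "###Translate English Text to German:Rain expected ###Output: Regen erwartet\n"
    "###Translate English Text to German:More rain"
)
print(extract_translation(raw))
```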
## Training procedure