jasperyeoh2 committed (verified) · Commit 257d07d · Parent: 47d8334

Update README.md

Files changed (1): README.md (+12 -18)
README.md CHANGED
@@ -7,6 +7,10 @@ datasets:
language:
- en
- th
+ - zh
+ metrics:
+ - accuracy
+ pipeline_tag: question-answering
---


@@ -18,7 +22,6 @@ language:
<!-- Provide a longer summary of what this model is. -->


-
- **Developed by:** [Jixin Yang @ HKUST]
- **Model type:** [PEFT (LoRA) fine-tuned LLaMA-2 7B for backward text generation]
- **Finetuned from model [optional]:** [meta-llama/Llama-2-7b-hf]
@@ -26,14 +29,7 @@ language:


## Uses
-
- <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
-
- ### Direct Use
-
- <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
-
- [This model is designed for backward text generation - given an output text, it generates the corresponding input.]
+ This model is designed for backward text generation - given an output text, it generates the corresponding input.



@@ -41,7 +37,7 @@ language:

Use the code below to get started with the model.

- [from transformers import AutoModelForCausalLM, AutoTokenizer
+ from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "jasperyeoh2/llama2-7b-backward-model"
tokenizer = AutoTokenizer.from_pretrained(model_name)
@@ -50,7 +46,7 @@ model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")
input_text = "Output text to reverse"
inputs = tokenizer(input_text, return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=50)
- print(tokenizer.decode(outputs[0], skip_special_tokens=True))]
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))

## Training Details

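A note on the quick-start snippet above: the card describes this checkpoint as a PEFT (LoRA) fine-tune, so if the repository ships adapter weights rather than a merged model, a plain `AutoModelForCausalLM` call may not pick up the adapter. A minimal sketch of the `peft` loading path, assuming an unmerged adapter:

```python
# Sketch only: load the checkpoint as a LoRA adapter via peft.
# Assumes jasperyeoh2/llama2-7b-backward-model contains PEFT adapter weights;
# if the adapter was already merged into the base model, the snippet in the
# card works as-is.
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

model_name = "jasperyeoh2/llama2-7b-backward-model"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoPeftModelForCausalLM.from_pretrained(model_name, device_map="auto")

input_text = "Output text to reverse"
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```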
@@ -58,9 +54,9 @@ print(tokenizer.decode(outputs[0], skip_special_tokens=True))]

<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

- [- Dataset: [OpenAssistant-Guanaco](https://huggingface.co/datasets/timdettmers/openassistant-guanaco)
+ - Dataset: [OpenAssistant-Guanaco](https://huggingface.co/datasets/timdettmers/openassistant-guanaco)
- Number of examples used: ~3,200
- - Task: Instruction Backtranslation (Answer → Prompt)]
+ - Task: Instruction Backtranslation (Answer → Prompt)

### Training Procedure

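For readers unfamiliar with the task named in the hunk above: instruction backtranslation flips the usual training direction, feeding the assistant answer in as input and training the model to recover the human prompt. The card does not document its prompt template, so the markers in this sketch are hypothetical:

```python
# Hypothetical illustration of an instruction-backtranslation training pair:
# the usual (prompt -> answer) direction is reversed so the model learns to
# reconstruct the prompt from the answer. The "### Output"/"### Input" markers
# are placeholders, not the template actually used for this model.
def format_backward_example(prompt: str, answer: str) -> str:
    return (
        "### Output:\n" + answer.strip() + "\n\n"
        "### Input:\n" + prompt.strip()
    )

print(format_backward_example(
    prompt="What is the capital of France?",
    answer="The capital of France is Paris.",
))
```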
@@ -68,7 +64,7 @@ print(tokenizer.decode(outputs[0], skip_special_tokens=True))]

#### Preprocessing [optional]

- [- Method: PEFT with LoRA (Low-Rank Adaptation)
+ - Method: PEFT with LoRA (Low-Rank Adaptation)
- Quantization: 4-bit (NF4)
- LoRA config:
  - `r`: 8
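The hunk above names the quantization and adapter settings. As a rough sketch of how a 4-bit NF4 base model with an `r=8` LoRA adapter is typically assembled with `bitsandbytes` and `peft`; note that `lora_alpha`, `lora_dropout`, and `target_modules` below are common defaults, not values documented in the card:

```python
# Sketch of the 4-bit NF4 + LoRA setup named in the card, built with
# bitsandbytes and peft. Only r=8 and NF4 quantization are stated in the
# card; the other LoRA fields here are typical defaults (assumptions).
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # NF4, as stated in the card
    bnb_4bit_compute_dtype=torch.bfloat16,  # assumption
)
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    quantization_config=bnb_config,
    device_map="auto",
)
lora_config = LoraConfig(
    r=8,                                  # from the card
    lora_alpha=16,                        # assumption
    lora_dropout=0.05,                    # assumption
    target_modules=["q_proj", "v_proj"],  # assumption
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()
```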
@@ -83,7 +79,7 @@ print(tokenizer.decode(outputs[0], skip_special_tokens=True))]
- Learning rate: 2e-5
- Scheduler: linear with warmup
- Optimizer: AdamW
- - Early stopping: enabled (patience=2)]
+ - Early stopping: enabled (patience=2)


#### Metrics
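The hyperparameters listed in the hunk above map directly onto `transformers` `TrainingArguments` plus an `EarlyStoppingCallback`; a sketch, with batch size, warmup steps, and epoch count assumed since the card does not state them:

```python
# Sketch of the listed hyperparameters as TrainingArguments plus early
# stopping. The card states only the learning rate, scheduler, optimizer,
# and patience; batch size, warmup steps, and epochs are assumptions.
from transformers import TrainingArguments, EarlyStoppingCallback

training_args = TrainingArguments(
    output_dir="llama2-7b-backward",  # hypothetical path
    learning_rate=2e-5,               # from the card
    lr_scheduler_type="linear",       # linear with warmup (card)
    warmup_steps=100,                 # assumption
    optim="adamw_torch",              # AdamW (card)
    num_train_epochs=3,               # assumption
    per_device_train_batch_size=4,    # assumption
    eval_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,      # required by EarlyStoppingCallback
    metric_for_best_model="eval_loss",
)
early_stopping = EarlyStoppingCallback(early_stopping_patience=2)  # card
# Pass training_args and callbacks=[early_stopping] to a Trainer/SFTTrainer.
```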
@@ -102,8 +98,6 @@ print(tokenizer.decode(outputs[0], skip_special_tokens=True))]

### Compute Infrastructure

- [#### Hardware
-
- GPU: 1× NVIDIA A800 (80GB)
- CUDA Version: 12.1

@@ -118,7 +112,7 @@ print(tokenizer.decode(outputs[0], skip_special_tokens=True))]

#### Hardware

- [NVIDIA A800 GPU]
+ NVIDIA A800 GPU


### Framework versions
 