lbourdois committed · Commit 60ac96a · verified · Parent(s): 22b3a57

Improve language tag


Hi! As the model is multilingual, this PR adds languages other than English to the `language` tag to improve discoverability. Note that 29 languages are announced in the README, but only 13 are explicitly listed, so only those 13 are added here.
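For reference, the 13 codes used in this PR are ISO 639-3 identifiers; a quick sketch of the mapping (language names follow the standard ISO 639-3 assignments):

```python
# The 13 ISO 639-3 language codes added to the `language:` tag,
# mapped to their English names (standard ISO 639-3 assignments).
languages = {
    "zho": "Chinese",
    "eng": "English",
    "fra": "French",
    "spa": "Spanish",
    "por": "Portuguese",
    "deu": "German",
    "ita": "Italian",
    "rus": "Russian",
    "jpn": "Japanese",
    "kor": "Korean",
    "vie": "Vietnamese",
    "tha": "Thai",
    "ara": "Arabic",
}
print(len(languages))  # 13
```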

Files changed (1): README.md (+246 −234)
README.md CHANGED
@@ -1,235 +1,247 @@
 ---
 license: other
 license_name: qwen
 license_link: https://huggingface.co/Qwen/Qwen2.5-72B-Instruct/blob/main/LICENSE
 datasets:
 - Magpie-Align/Magpie-Qwen2.5-Pro-1M-v0.1
 base_model:
 - Qwen/Qwen2.5-7B-Instruct
 library_name: transformers
 tags:
 - generated_from_trainer
 language:
-- en
+- zho
+- eng
+- fra
+- spa
+- por
+- deu
+- ita
+- rus
+- jpn
+- kor
+- vie
+- tha
+- ara
 model-index:
 - name: cybertron-v4-qw7B-MGS
   results:
   - task:
       type: text-generation
       name: Text Generation
     dataset:
       name: IFEval (0-Shot)
       type: HuggingFaceH4/ifeval
       args:
         num_few_shot: 0
     metrics:
     - type: inst_level_strict_acc and prompt_level_strict_acc
       value: 62.64
       name: strict accuracy
     source:
       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=fblgit/cybertron-v4-qw7B-MGS
       name: Open LLM Leaderboard
   - task:
       type: text-generation
       name: Text Generation
     dataset:
       name: BBH (3-Shot)
       type: BBH
       args:
         num_few_shot: 3
     metrics:
     - type: acc_norm
       value: 37.04
       name: normalized accuracy
     source:
       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=fblgit/cybertron-v4-qw7B-MGS
       name: Open LLM Leaderboard
   - task:
       type: text-generation
       name: Text Generation
     dataset:
       name: MATH Lvl 5 (4-Shot)
       type: hendrycks/competition_math
       args:
         num_few_shot: 4
     metrics:
     - type: exact_match
       value: 27.72
       name: exact match
     source:
       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=fblgit/cybertron-v4-qw7B-MGS
       name: Open LLM Leaderboard
   - task:
       type: text-generation
       name: Text Generation
     dataset:
       name: GPQA (0-shot)
       type: Idavidrein/gpqa
       args:
         num_few_shot: 0
     metrics:
     - type: acc_norm
       value: 8.05
       name: acc_norm
     source:
       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=fblgit/cybertron-v4-qw7B-MGS
       name: Open LLM Leaderboard
   - task:
       type: text-generation
       name: Text Generation
     dataset:
       name: MuSR (0-shot)
       type: TAUR-Lab/MuSR
       args:
         num_few_shot: 0
     metrics:
     - type: acc_norm
       value: 13.2
       name: acc_norm
     source:
       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=fblgit/cybertron-v4-qw7B-MGS
       name: Open LLM Leaderboard
   - task:
       type: text-generation
       name: Text Generation
     dataset:
       name: MMLU-PRO (5-shot)
       type: TIGER-Lab/MMLU-Pro
       config: main
       split: test
       args:
         num_few_shot: 5
     metrics:
     - type: acc
       value: 38.59
       name: accuracy
     source:
       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=fblgit/cybertron-v4-qw7B-MGS
       name: Open LLM Leaderboard
 ---
 
 # cybertron-v4-qw7B-MGS
 
 **WE ARE BACK** Cybertron v4, #1 LLM in its class. Based on the amazing Qwen2.5 7B
 
 **Scoring #1 LLM of 7B and 8B at 30.10.2024.**
 
 ![cybertron-v4-MGS](https://huggingface.co/fblgit/cybertron-v4-qw7B-MGS/resolve/main/cybertron_v4MGS.png)
 
 Here we use our novel approach called `MGS`. Its up to you to figure out what it means.
 
 Cybertron V4 went thru SFT over `Magpie-Align/Magpie-Qwen2.5-Pro-1M-v0.1`
 
 ## Quantz
 Avaialble at https://huggingface.co/bartowski/cybertron-v4-qw7B-MGS-GGUF
 
 ## MGS
 Being fair:
 
 https://arxiv.org/pdf/2410.21228
 
 MGS, among other things.. a strategy of tackling corpora forgetful.
 
 # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
 Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/fblgit__cybertron-v4-qw7B-MGS-details)
 
 | Metric             | Value |
 |--------------------|------:|
 | Avg.               | 31.21 |
 | IFEval (0-Shot)    | 62.64 |
 | BBH (3-Shot)       | 37.04 |
 | MATH Lvl 5 (4-Shot)| 27.72 |
 | GPQA (0-shot)      |  8.05 |
 | MuSR (0-shot)      | 13.20 |
 | MMLU-PRO (5-shot)  | 38.59 |
 
 ## Try Cybertron v4!
 
 Thanks to @rombodawg for contributing with a free to use Inference space hosted at:
 
 https://huggingface.co/spaces/rombodawg/Try_fblgit_cybertron-v4-qw7B-MGS
 
 ## Training procedure
 1 Epoch as usual.
 [<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/axolotl-ai-cloud/axolotl)
 
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
 - seed: 42
 - distributed_type: multi-GPU
 - num_devices: 8
 - total_train_batch_size: 128
 - total_eval_batch_size: 16
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - num_epochs: 1
 
 ### Training results
 
 | Training Loss | Epoch  | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
 | 0.7405        | 0.0007 | 1    | 0.5760          |
 | 0.6146        | 0.0502 | 71   | 0.5045          |
 | 0.5908        | 0.1003 | 142  | 0.4930          |
 | 0.5669        | 0.1505 | 213  | 0.4854          |
 | 0.5575        | 0.2007 | 284  | 0.4811          |
 | 0.535         | 0.2508 | 355  | 0.4765          |
 | 0.5161        | 0.3010 | 426  | 0.4736          |
 | 0.5268        | 0.3511 | 497  | 0.4726          |
 | 0.5119        | 0.4013 | 568  | 0.4701          |
 | 0.5329        | 0.4515 | 639  | 0.4687          |
 | 0.5167        | 0.5016 | 710  | 0.4673          |
 | 0.5105        | 0.5518 | 781  | 0.4660          |
 | 0.5203        | 0.6020 | 852  | 0.4653          |
 | 0.5035        | 0.6521 | 923  | 0.4646          |
 | 0.4903        | 0.7023 | 994  | 0.4641          |
 | 0.5031        | 0.7525 | 1065 | 0.4628          |
 | 0.5147        | 0.8026 | 1136 | 0.4629          |
 | 0.5037        | 0.8528 | 1207 | 0.4620          |
 | 0.5029        | 0.9029 | 1278 | 0.4620          |
 | 0.492         | 0.9531 | 1349 | 0.4621          |
 
 
 ### Framework versions
 
 - PEFT 0.13.2
 - Transformers 4.45.2
 - Pytorch 2.3.0+cu121
 - Datasets 3.0.1
 - Tokenizers 0.20.1
 
 ## Citations
 ```
 @misc{thebeagle-v2,
   title={TheBeagle v2: MGS},
   author={Xavier Murias},
   year={2024},
   publisher = {HuggingFace},
   journal = {HuggingFace repository},
   howpublished = {\url{https://huggingface.co/fblgit/TheBeagle-v2beta-32B-MGS}},
 }
 
 @misc{Magpie,
   title={Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing},
   author={Zhangchen Xu and Fengqing Jiang and Luyao Niu and Yuntian Deng and Radha Poovendran and Yejin Choi and Bill Yuchen Lin},
   year={2024},
   eprint={2406.08464},
   archivePrefix={arXiv},
   primaryClass={cs.CL}
 }
 
 @misc{qwen2.5,
   title = {Qwen2.5: A Party of Foundation Models},
   url = {https://qwenlm.github.io/blog/qwen2.5/},
   author = {Qwen Team},
   month = {September},
   year = {2024}
 }
 
 @article{qwen2,
   title={Qwen2 Technical Report},
   author={An Yang and Baosong Yang and Binyuan Hui and Bo Zheng and Bowen Yu and Chang Zhou and Chengpeng Li and Chengyuan Li and Dayiheng Liu and Fei Huang and Guanting Dong and Haoran Wei and Huan Lin and Jialong Tang and Jialin Wang and Jian Yang and Jianhong Tu and Jianwei Zhang and Jianxin Ma and Jin Xu and Jingren Zhou and Jinze Bai and Jinzheng He and Junyang Lin and Kai Dang and Keming Lu and Keqin Chen and Kexin Yang and Mei Li and Mingfeng Xue and Na Ni and Pei Zhang and Peng Wang and Ru Peng and Rui Men and Ruize Gao and Runji Lin and Shijie Wang and Shuai Bai and Sinan Tan and Tianhang Zhu and Tianhao Li and Tianyu Liu and Wenbin Ge and Xiaodong Deng and Xiaohuan Zhou and Xingzhang Ren and Xinyu Zhang and Xipin Wei and Xuancheng Ren and Yang Fan and Yang Yao and Yichang Zhang and Yu Wan and Yunfei Chu and Yuqiong Liu and Zeyu Cui and Zhenru Zhang and Zhihao Fan},
   journal={arXiv preprint arXiv:2407.10671},
   year={2024}
 }
 ```
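The hyperparameters listed in the README could be expressed in an Axolotl-style config roughly as follows. This is a sketch only, not the actual config used: the micro-batch/gradient-accumulation split and the `adamw_torch` optimizer name are assumptions, since the README reports only the totals (8 devices × 4 micro-batch × 4 accumulation = total_train_batch_size 128).

```yaml
# Sketch of an Axolotl-style config matching the reported hyperparameters.
# ASSUMPTIONS: micro_batch_size/gradient_accumulation_steps split and the
# optimizer name are inferred, not taken from the actual training config.
base_model: Qwen/Qwen2.5-7B-Instruct
datasets:
  - path: Magpie-Align/Magpie-Qwen2.5-Pro-1M-v0.1
seed: 42
num_epochs: 1
micro_batch_size: 4            # 8 GPUs x 4 x 4 accumulation = 128 total
gradient_accumulation_steps: 4
eval_batch_size: 2             # 8 GPUs x 2 = 16 total
optimizer: adamw_torch
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1.0e-8
```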
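As a quick sanity check on the README's leaderboard table, the reported average is the unweighted mean of the six benchmark scores:

```python
# Benchmark scores from the Open LLM Leaderboard table in the README.
scores = {
    "IFEval (0-Shot)": 62.64,
    "BBH (3-Shot)": 37.04,
    "MATH Lvl 5 (4-Shot)": 27.72,
    "GPQA (0-shot)": 8.05,
    "MuSR (0-shot)": 13.20,
    "MMLU-PRO (5-shot)": 38.59,
}

# Unweighted mean, rounded to two decimals as on the leaderboard.
avg = round(sum(scores.values()) / len(scores), 2)
print(avg)  # 31.21, matching the "Avg." row
```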