lbourdois committed on
Commit c72516a · verified · 1 Parent(s): 7860008

Improve language tag


Hi! Since the model is multilingual, this PR adds languages other than English to the `language` tag to improve the model's discoverability. Note that 29 languages are announced in the README, but only 13 are explicitly listed, so I was only able to add those 13.
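For context, these `language` entries in the model card's YAML front matter are what the Hub uses to index and filter models by language. Below is a minimal sketch of querying one of the added tags, assuming a recent `huggingface_hub` client; the `language` filter parameter and the `fra` code are illustrative and may need adjusting for your installed version.

```python
# Sketch: find fblgit models that declare French in their language metadata.
# Assumes a recent huggingface_hub release in which list_models() accepts a
# `language` filter; "fra" is one of the ISO 639-3 codes added by this PR
# (the Hub may also match the two-letter form "fr").
from huggingface_hub import HfApi

api = HfApi()
for model in api.list_models(author="fblgit", language="fra", limit=10):
    print(model.id)
```

With the richer tag list, the model should surface under each of the thirteen listed languages rather than under English only.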

Files changed (1)
  1. README.md +130 -118
README.md CHANGED
@@ -1,119 +1,131 @@
- ---
- language:
- - en
- license: other
- license_name: qwen
- license_link: https://huggingface.co/Qwen/Qwen2.5-72B-Instruct/blob/main/LICENSE
- library_name: transformers
- tags:
- - generated_from_trainer
- base_model: Qwen/Qwen2.5-1.5B-Instruct
- model-index:
- - name: miniclaus-qw1.5B-UNAMGS
- results: []
- datasets:
- - Magpie-Align/Magpie-Pro-MT-300K-v0.1
- ---
-
- # miniclaus-qw1.5B-UNAMGS
-
- Trained with `Magpie-Align/Magpie-Pro-MT-300K-v0.1`
-
- Using MGS & UNA (MLP) on this tiny but powerful model.
-
- ![miniclaus-qw1.5B-UNAMGS](https://huggingface.co/fblgit/miniclaus-qw1.5B-UNAMGS/resolve/main/miniclaus_qw15-UNAMGS.png)
- [<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/axolotl-ai-cloud/axolotl)
-
- It achieves the following results on the evaluation set:
- - Loss: 0.7193
-
- ## Quants
- Available at:
- * https://huggingface.co/bartowski/miniclaus-qw1.5B-UNAMGS-GGUF
- * https://huggingface.co/QuantFactory/miniclaus-qw1.5B-UNAMGS-GGUF
-
- ## Training procedure
-
- ### Training hyperparameters
-
- The following hyperparameters were used during training:
- - train_batch_size: 1
- - seed: 42
- - distributed_type: multi-GPU
- - num_devices: 8
- - total_train_batch_size: 128
- - total_eval_batch_size: 8
- - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - num_epochs: 1
-
- ### Training results
-
- | Training Loss | Epoch | Step | Validation Loss |
- |:-------------:|:------:|:----:|:---------------:|
- | 1.1641 | 0.0007 | 1 | 0.8514 |
- | 0.9246 | 0.0503 | 76 | 0.7921 |
- | 0.8791 | 0.1006 | 152 | 0.7727 |
- | 0.8507 | 0.1509 | 228 | 0.7611 |
- | 0.8376 | 0.2012 | 304 | 0.7534 |
- | 0.793 | 0.2515 | 380 | 0.7467 |
- | 0.7834 | 0.3018 | 456 | 0.7421 |
- | 0.7807 | 0.3521 | 532 | 0.7384 |
- | 0.764 | 0.4023 | 608 | 0.7359 |
- | 0.7738 | 0.4526 | 684 | 0.7320 |
- | 0.7425 | 0.5029 | 760 | 0.7300 |
- | 0.7519 | 0.5532 | 836 | 0.7279 |
- | 0.7461 | 0.6035 | 912 | 0.7255 |
- | 0.7489 | 0.6538 | 988 | 0.7245 |
- | 0.7614 | 0.7041 | 1064 | 0.7222 |
- | 0.7576 | 0.7544 | 1140 | 0.7222 |
- | 0.7303 | 0.8047 | 1216 | 0.7209 |
- | 0.7332 | 0.8550 | 1292 | 0.7199 |
- | 0.7541 | 0.9053 | 1368 | 0.7202 |
- | 0.7369 | 0.9556 | 1444 | 0.7193 |
-
-
- ### Framework versions
-
- - PEFT 0.13.2
- - Transformers 4.45.2
- - Pytorch 2.3.0+cu121
- - Datasets 3.0.1
- - Tokenizers 0.20.1
-
- ## Thanks
- - Qwen Team for their outstanding model
- - MagPie Team for contributing plenty of datasets
- - Cybertron Cloud Compute
-
- ## Citations
- ```
- @misc{miniclaus-qw15,
- title={MiniClaus: 1.5B UNAMGS},
- author={Xavier Murias},
- year={2024},
- publisher = {HuggingFace},
- journal = {HuggingFace repository},
- howpublished = {\url{https://huggingface.co/fblgit/miniclaus-qw1.5B-UNAMGS}},
- }
- @misc{Magpie,
- title={Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing},
- author={Zhangchen Xu and Fengqing Jiang and Luyao Niu and Yuntian Deng and Radha Poovendran and Yejin Choi and Bill Yuchen Lin},
- year={2024},
- eprint={2406.08464},
- archivePrefix={arXiv},
- primaryClass={cs.CL}
- }
- @misc{qwen2.5,
- title = {Qwen2.5: A Party of Foundation Models},
- url = {https://qwenlm.github.io/blog/qwen2.5/},
- author = {Qwen Team},
- month = {September},
- year = {2024}
- }
- @article{qwen2,
- title={Qwen2 Technical Report},
- author={An Yang and Baosong Yang and Binyuan Hui and Bo Zheng and Bowen Yu and Chang Zhou and Chengpeng Li and Chengyuan Li and Dayiheng Liu and Fei Huang and Guanting Dong and Haoran Wei and Huan Lin and Jialong Tang and Jialin Wang and Jian Yang and Jianhong Tu and Jianwei Zhang and Jianxin Ma and Jin Xu and Jingren Zhou and Jinze Bai and Jinzheng He and Junyang Lin and Kai Dang and Keming Lu and Keqin Chen and Kexin Yang and Mei Li and Mingfeng Xue and Na Ni and Pei Zhang and Peng Wang and Ru Peng and Rui Men and Ruize Gao and Runji Lin and Shijie Wang and Shuai Bai and Sinan Tan and Tianhang Zhu and Tianhao Li and Tianyu Liu and Wenbin Ge and Xiaodong Deng and Xiaohuan Zhou and Xingzhang Ren and Xinyu Zhang and Xipin Wei and Xuancheng Ren and Yang Fan and Yang Yao and Yichang Zhang and Yu Wan and Yunfei Chu and Yuqiong Liu and Zeyu Cui and Zhenru Zhang and Zhihao Fan},
- journal={arXiv preprint arXiv:2407.10671},
- year={2024}
- }
+ ---
+ language:
+ - zho
+ - eng
+ - fra
+ - spa
+ - por
+ - deu
+ - ita
+ - rus
+ - jpn
+ - kor
+ - vie
+ - tha
+ - ara
+ license: other
+ license_name: qwen
+ license_link: https://huggingface.co/Qwen/Qwen2.5-72B-Instruct/blob/main/LICENSE
+ library_name: transformers
+ tags:
+ - generated_from_trainer
+ base_model: Qwen/Qwen2.5-1.5B-Instruct
+ datasets:
+ - Magpie-Align/Magpie-Pro-MT-300K-v0.1
+ model-index:
+ - name: miniclaus-qw1.5B-UNAMGS
+ results: []
+ ---
+
+ # miniclaus-qw1.5B-UNAMGS
+
+ Trained with `Magpie-Align/Magpie-Pro-MT-300K-v0.1`
+
+ Using MGS & UNA (MLP) on this tiny but powerful model.
+
+ ![miniclaus-qw1.5B-UNAMGS](https://huggingface.co/fblgit/miniclaus-qw1.5B-UNAMGS/resolve/main/miniclaus_qw15-UNAMGS.png)
+ [<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/axolotl-ai-cloud/axolotl)
+
+ It achieves the following results on the evaluation set:
+ - Loss: 0.7193
+
+ ## Quants
+ Available at:
+ * https://huggingface.co/bartowski/miniclaus-qw1.5B-UNAMGS-GGUF
+ * https://huggingface.co/QuantFactory/miniclaus-qw1.5B-UNAMGS-GGUF
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - train_batch_size: 1
+ - seed: 42
+ - distributed_type: multi-GPU
+ - num_devices: 8
+ - total_train_batch_size: 128
+ - total_eval_batch_size: 8
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - num_epochs: 1
+
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss |
+ |:-------------:|:------:|:----:|:---------------:|
+ | 1.1641 | 0.0007 | 1 | 0.8514 |
+ | 0.9246 | 0.0503 | 76 | 0.7921 |
+ | 0.8791 | 0.1006 | 152 | 0.7727 |
+ | 0.8507 | 0.1509 | 228 | 0.7611 |
+ | 0.8376 | 0.2012 | 304 | 0.7534 |
+ | 0.793 | 0.2515 | 380 | 0.7467 |
+ | 0.7834 | 0.3018 | 456 | 0.7421 |
+ | 0.7807 | 0.3521 | 532 | 0.7384 |
+ | 0.764 | 0.4023 | 608 | 0.7359 |
+ | 0.7738 | 0.4526 | 684 | 0.7320 |
+ | 0.7425 | 0.5029 | 760 | 0.7300 |
+ | 0.7519 | 0.5532 | 836 | 0.7279 |
+ | 0.7461 | 0.6035 | 912 | 0.7255 |
+ | 0.7489 | 0.6538 | 988 | 0.7245 |
+ | 0.7614 | 0.7041 | 1064 | 0.7222 |
+ | 0.7576 | 0.7544 | 1140 | 0.7222 |
+ | 0.7303 | 0.8047 | 1216 | 0.7209 |
+ | 0.7332 | 0.8550 | 1292 | 0.7199 |
+ | 0.7541 | 0.9053 | 1368 | 0.7202 |
+ | 0.7369 | 0.9556 | 1444 | 0.7193 |
+
+
+ ### Framework versions
+
+ - PEFT 0.13.2
+ - Transformers 4.45.2
+ - Pytorch 2.3.0+cu121
+ - Datasets 3.0.1
+ - Tokenizers 0.20.1
+
+ ## Thanks
+ - Qwen Team for their outstanding model
+ - MagPie Team for contributing plenty of datasets
+ - Cybertron Cloud Compute
+
+ ## Citations
+ ```
+ @misc{miniclaus-qw15,
+ title={MiniClaus: 1.5B UNAMGS},
+ author={Xavier Murias},
+ year={2024},
+ publisher = {HuggingFace},
+ journal = {HuggingFace repository},
+ howpublished = {\url{https://huggingface.co/fblgit/miniclaus-qw1.5B-UNAMGS}},
+ }
+ @misc{Magpie,
+ title={Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing},
+ author={Zhangchen Xu and Fengqing Jiang and Luyao Niu and Yuntian Deng and Radha Poovendran and Yejin Choi and Bill Yuchen Lin},
+ year={2024},
+ eprint={2406.08464},
+ archivePrefix={arXiv},
+ primaryClass={cs.CL}
+ }
+ @misc{qwen2.5,
+ title = {Qwen2.5: A Party of Foundation Models},
+ url = {https://qwenlm.github.io/blog/qwen2.5/},
+ author = {Qwen Team},
+ month = {September},
+ year = {2024}
+ }
+ @article{qwen2,
+ title={Qwen2 Technical Report},
+ author={An Yang and Baosong Yang and Binyuan Hui and Bo Zheng and Bowen Yu and Chang Zhou and Chengpeng Li and Chengyuan Li and Dayiheng Liu and Fei Huang and Guanting Dong and Haoran Wei and Huan Lin and Jialong Tang and Jialin Wang and Jian Yang and Jianhong Tu and Jianwei Zhang and Jianxin Ma and Jin Xu and Jingren Zhou and Jinze Bai and Jinzheng He and Junyang Lin and Kai Dang and Keming Lu and Keqin Chen and Kexin Yang and Mei Li and Mingfeng Xue and Na Ni and Pei Zhang and Peng Wang and Ru Peng and Rui Men and Ruize Gao and Runji Lin and Shijie Wang and Shuai Bai and Sinan Tan and Tianhang Zhu and Tianhao Li and Tianyu Liu and Wenbin Ge and Xiaodong Deng and Xiaohuan Zhou and Xingzhang Ren and Xinyu Zhang and Xipin Wei and Xuancheng Ren and Yang Fan and Yang Yao and Yichang Zhang and Yu Wan and Yunfei Chu and Yuqiong Liu and Zeyu Cui and Zhenru Zhang and Zhihao Fan},
+ journal={arXiv preprint arXiv:2407.10671},
+ year={2024}
+ }
  ```