lbourdois committed on
Commit 6429cf7 · verified · 1 Parent(s): e329e14

Improve language tag

Hi! Since the model is multilingual, this PR adds languages other than English to the language tag to improve referencing. Note that the README announces 29 languages but explicitly lists only 13, so I was only able to add those 13.
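
As a rough illustration of the gap described above, the sketch below lists the 13 ISO 639-3 codes added in this diff together with their English names and computes how many of the 29 announced languages remain untagged. The names `ADDED_LANGUAGES`, `ANNOUNCED`, and `missing_count` are hypothetical and introduced only for this example; the figure 29 is taken from the description, not verified independently.

```python
# ISO 639-3 codes added to the `language` tag in this diff, with English names.
ADDED_LANGUAGES = {
    "zho": "Chinese",
    "eng": "English",
    "fra": "French",
    "spa": "Spanish",
    "por": "Portuguese",
    "deu": "German",
    "ita": "Italian",
    "rus": "Russian",
    "jpn": "Japanese",
    "kor": "Korean",
    "vie": "Vietnamese",
    "tha": "Thai",
    "ara": "Arabic",
}

# Number of languages announced in the README, per the PR description (assumption).
ANNOUNCED = 29


def missing_count(tagged, announced=ANNOUNCED):
    """Return how many announced languages still lack an explicit tag."""
    return announced - len(tagged)


print(len(ADDED_LANGUAGES), "languages tagged,", missing_count(ADDED_LANGUAGES), "not listed")
```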

Files changed (1)
  1. README.md +119 -106
README.md CHANGED
@@ -1,106 +1,119 @@
- ---
- base_model:
- - Qwen/Qwen2.5-14B
- - Krystalan/DRT-o1-14B
- - netease-youdao/Confucius-o1-14B
- - huihui-ai/DeepSeek-R1-Distill-Qwen-14B-abliterated
- - djuna/Q2.5-Veltha-14B-0.5
- library_name: transformers
- tags:
- - mergekit
- - merge
-
- ---
- # merge
-
- This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
-
- ## Merge Details
- ### Merge Method
-
- This model was merged using the [SCE](https://arxiv.org/abs/2408.07990) merge method using [Qwen/Qwen2.5-14B](https://huggingface.co/Qwen/Qwen2.5-14B) as a base.
-
- ### Models Merged
-
- The following models were included in the merge:
- * [Krystalan/DRT-o1-14B](https://huggingface.co/Krystalan/DRT-o1-14B)
- * [netease-youdao/Confucius-o1-14B](https://huggingface.co/netease-youdao/Confucius-o1-14B)
- * [huihui-ai/DeepSeek-R1-Distill-Qwen-14B-abliterated](https://huggingface.co/huihui-ai/DeepSeek-R1-Distill-Qwen-14B-abliterated)
- * [djuna/Q2.5-Veltha-14B-0.5](https://huggingface.co/djuna/Q2.5-Veltha-14B-0.5)
-
- ### Configuration
-
- The following YAML configuration was used to produce this model:
-
- ```yaml
- models:
-   - model: Qwen/Qwen2.5-14B
-   - model: netease-youdao/Confucius-o1-14B
-   - model: djuna/Q2.5-Veltha-14B-0.5
-   - model: Krystalan/DRT-o1-14B
-   - model: huihui-ai/DeepSeek-R1-Distill-Qwen-14B-abliterated
- merge_method: sce
- base_model: Qwen/Qwen2.5-14B
- tokenizer:
-   source: "union"
-   tokens:
-     <|endoftext|>:
-       source: "djuna/Q2.5-Veltha-14B-0.5"
-     <|im_start|>:
-       source: "djuna/Q2.5-Veltha-14B-0.5"
-     <|im_end|>:
-       source: "djuna/Q2.5-Veltha-14B-0.5"
-     <|object_ref_start|>:
-       source: "djuna/Q2.5-Veltha-14B-0.5"
-     <|object_ref_end|>:
-       source: "djuna/Q2.5-Veltha-14B-0.5"
-     <|box_start|>:
-       source: "djuna/Q2.5-Veltha-14B-0.5"
-     <|box_end|>:
-       source: "djuna/Q2.5-Veltha-14B-0.5"
-     <|end▁of▁sentence|>:
-       source:
-         model: "huihui-ai/DeepSeek-R1-Distill-Qwen-14B-abliterated"
-         kind: "model_token"
-         token: "<|end▁of▁sentence|>"
-       force: true
-     <|User|>:
-       source:
-         model: "huihui-ai/DeepSeek-R1-Distill-Qwen-14B-abliterated"
-         kind: "model_token"
-         token: "<|User|>"
-       force: true
-     <|Assistant|>:
-       source:
-         model: "huihui-ai/DeepSeek-R1-Distill-Qwen-14B-abliterated"
-         kind: "model_token"
-         token: "<|Assistant|>"
-       force: true
-     <|begin▁of▁sentence|>:
-       source:
-         model: "huihui-ai/DeepSeek-R1-Distill-Qwen-14B-abliterated"
-         kind: "model_token"
-         token: "<|begin▁of▁sentence|>"
-       force: true
-     <|EOT|>:
-       source:
-         model: "huihui-ai/DeepSeek-R1-Distill-Qwen-14B-abliterated"
-         kind: "model_token"
-         token: "<|EOT|>"
-       force: true
-     <think>:
-       source:
-         model: "huihui-ai/DeepSeek-R1-Distill-Qwen-14B-abliterated"
-         kind: "model_token"
-         token: "<think>"
-       force: true
-     </think>:
-       source:
-         model: "huihui-ai/DeepSeek-R1-Distill-Qwen-14B-abliterated"
-         kind: "model_token"
-         token: "</think>"
-       force: true
- dtype: float32
- out_dtype: bfloat16
-
- ```
+ ---
+ base_model:
+ - Qwen/Qwen2.5-14B
+ - Krystalan/DRT-o1-14B
+ - netease-youdao/Confucius-o1-14B
+ - huihui-ai/DeepSeek-R1-Distill-Qwen-14B-abliterated
+ - djuna/Q2.5-Veltha-14B-0.5
+ library_name: transformers
+ tags:
+ - mergekit
+ - merge
+ language:
+ - zho
+ - eng
+ - fra
+ - spa
+ - por
+ - deu
+ - ita
+ - rus
+ - jpn
+ - kor
+ - vie
+ - tha
+ - ara
+ ---
+ # merge
+
+ This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
+
+ ## Merge Details
+ ### Merge Method
+
+ This model was merged using the [SCE](https://arxiv.org/abs/2408.07990) merge method using [Qwen/Qwen2.5-14B](https://huggingface.co/Qwen/Qwen2.5-14B) as a base.
+
+ ### Models Merged
+
+ The following models were included in the merge:
+ * [Krystalan/DRT-o1-14B](https://huggingface.co/Krystalan/DRT-o1-14B)
+ * [netease-youdao/Confucius-o1-14B](https://huggingface.co/netease-youdao/Confucius-o1-14B)
+ * [huihui-ai/DeepSeek-R1-Distill-Qwen-14B-abliterated](https://huggingface.co/huihui-ai/DeepSeek-R1-Distill-Qwen-14B-abliterated)
+ * [djuna/Q2.5-Veltha-14B-0.5](https://huggingface.co/djuna/Q2.5-Veltha-14B-0.5)
+
+ ### Configuration
+
+ The following YAML configuration was used to produce this model:
+
+ ```yaml
+ models:
+   - model: Qwen/Qwen2.5-14B
+   - model: netease-youdao/Confucius-o1-14B
+   - model: djuna/Q2.5-Veltha-14B-0.5
+   - model: Krystalan/DRT-o1-14B
+   - model: huihui-ai/DeepSeek-R1-Distill-Qwen-14B-abliterated
+ merge_method: sce
+ base_model: Qwen/Qwen2.5-14B
+ tokenizer:
+   source: "union"
+   tokens:
+     <|endoftext|>:
+       source: "djuna/Q2.5-Veltha-14B-0.5"
+     <|im_start|>:
+       source: "djuna/Q2.5-Veltha-14B-0.5"
+     <|im_end|>:
+       source: "djuna/Q2.5-Veltha-14B-0.5"
+     <|object_ref_start|>:
+       source: "djuna/Q2.5-Veltha-14B-0.5"
+     <|object_ref_end|>:
+       source: "djuna/Q2.5-Veltha-14B-0.5"
+     <|box_start|>:
+       source: "djuna/Q2.5-Veltha-14B-0.5"
+     <|box_end|>:
+       source: "djuna/Q2.5-Veltha-14B-0.5"
+     <|end▁of▁sentence|>:
+       source:
+         model: "huihui-ai/DeepSeek-R1-Distill-Qwen-14B-abliterated"
+         kind: "model_token"
+         token: "<|end▁of▁sentence|>"
+       force: true
+     <|User|>:
+       source:
+         model: "huihui-ai/DeepSeek-R1-Distill-Qwen-14B-abliterated"
+         kind: "model_token"
+         token: "<|User|>"
+       force: true
+     <|Assistant|>:
+       source:
+         model: "huihui-ai/DeepSeek-R1-Distill-Qwen-14B-abliterated"
+         kind: "model_token"
+         token: "<|Assistant|>"
+       force: true
+     <|begin▁of▁sentence|>:
+       source:
+         model: "huihui-ai/DeepSeek-R1-Distill-Qwen-14B-abliterated"
+         kind: "model_token"
+         token: "<|begin▁of▁sentence|>"
+       force: true
+     <|EOT|>:
+       source:
+         model: "huihui-ai/DeepSeek-R1-Distill-Qwen-14B-abliterated"
+         kind: "model_token"
+         token: "<|EOT|>"
+       force: true
+     <think>:
+       source:
+         model: "huihui-ai/DeepSeek-R1-Distill-Qwen-14B-abliterated"
+         kind: "model_token"
+         token: "<think>"
+       force: true
+     </think>:
+       source:
+         model: "huihui-ai/DeepSeek-R1-Distill-Qwen-14B-abliterated"
+         kind: "model_token"
+         token: "</think>"
+       force: true
+ dtype: float32
+ out_dtype: bfloat16
+
+ ```
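
One way to check the metadata change in this diff is to read a top-level list out of the README front matter and confirm that `language` is present. The sketch below is a minimal, standard-library-only illustration: `front_matter_list` and `README_HEAD` are hypothetical names, the sample text is abbreviated from the diff, and the parser handles only the flat `key:` / `- item` layout used in this metadata block. It is not a general YAML parser.

```python
def front_matter_list(readme_text, key):
    """Return the items of the top-level list `key` in the front matter.

    Only the simple `key:` followed by `- item` lines layout is supported.
    """
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return []  # no front matter block at the top of the file
    items, in_key = [], False
    for line in lines[1:]:
        stripped = line.strip()
        if not stripped:
            continue
        if stripped == "---":  # closing delimiter of the front matter
            break
        if line[0] not in " -" and ":" in line:
            # A new top-level key starts; track whether it is the one we want.
            in_key = line.split(":", 1)[0].strip() == key
        elif in_key and stripped.startswith("- "):
            items.append(stripped[2:].strip())
    return items


# Abbreviated sample based on the front matter in this diff (illustrative only).
README_HEAD = """---
base_model:
- Qwen/Qwen2.5-14B
library_name: transformers
tags:
- mergekit
- merge
language:
- zho
- eng
---
# merge
"""

print(front_matter_list(README_HEAD, "language"))
```

Running the same helper against the full README should list all 13 codes if the front matter is laid out as in the diff above.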