lbourdois committed on
Commit 3bb8971 · verified · 1 Parent(s): cda4f1e

Improve language tag

Hi! Since the model is multilingual, this PR adds languages other than English to the `language` tag to improve the model's discoverability. Note that the README announces 29 languages but explicitly lists only 13, so I was only able to add those 13.
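For readers who want to reproduce the tag change, here is a minimal sketch of the code conversion. It assumes the goal is to turn two-letter ISO 639-1 tags such as `en` and `zh` into the three-letter ISO 639-3 codes that appear in the new front matter. The mapping table and the `to_iso639_3` helper are hypothetical and written by hand for illustration; the PR does not say which tool, if any, was used.

```python
# Hand-written mapping for the 13 languages added in this PR.
# This table is an illustrative assumption, not part of the PR itself.
ISO_639_1_TO_3 = {
    "zh": "zho", "en": "eng", "fr": "fra", "es": "spa", "pt": "por",
    "de": "deu", "it": "ita", "ru": "rus", "ja": "jpn", "ko": "kor",
    "vi": "vie", "th": "tha", "ar": "ara",
}


def to_iso639_3(codes):
    """Convert two-letter language codes to three-letter ones, keeping order."""
    converted = []
    for code in codes:
        key = code.lower()
        if key not in ISO_639_1_TO_3:
            # Fail loudly instead of guessing a code that is not in the table.
            raise KeyError(f"no three-letter mapping known for {code!r}")
        converted.append(ISO_639_1_TO_3[key])
    return converted


# The two tags present before this PR.
old_tags = ["en", "zh"]
print(to_iso639_3(old_tags))
```

Raising on an unknown code is a deliberate choice here: a silently wrong language tag would be harder to notice than a failed conversion.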

Files changed (1)
  1. README.md +72 -61
README.md CHANGED
@@ -1,62 +1,73 @@
- ---
- base_model:
- - qihoo360/Light-R1-32B-DS
- - Qwen/Qwen2.5-32B
- - Gen-Verse/ReasonFlux-F1
- - qihoo360/TinyR1-32B-Preview
- - Skywork/Skywork-OR1-32B-Preview
- - Qwen/Qwen2.5-32B-Instruct
- library_name: transformers
- tags:
- - mergekit
- - merge
- license: apache-2.0
- language:
- - en
- - zh
- pipeline_tag: text-generation
- ---
-
- *It has solved the problem of repeated generation in the previous generation.*
-
- # merge
-
- This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
-
- ## Merge Details
- ### Merge Method
-
- This model was merged using the [SCE](https://arxiv.org/abs/2408.07990) merge method using [Qwen/Qwen2.5-32B](https://huggingface.co/Qwen/Qwen2.5-32B) as a base.
-
- ### Models Merged
-
- The following models were included in the merge:
- * [qihoo360/Light-R1-32B-DS](https://huggingface.co/qihoo360/Light-R1-32B-DS)
- * [Gen-Verse/ReasonFlux-F1](https://huggingface.co/Gen-Verse/ReasonFlux-F1)
- * [qihoo360/TinyR1-32B-Preview](https://huggingface.co/qihoo360/TinyR1-32B-Preview)
- * [Skywork/Skywork-OR1-32B-Preview](https://huggingface.co/Skywork/Skywork-OR1-32B-Preview)
- * [Qwen/Qwen2.5-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-32B-Instruct)
-
- ### Configuration
-
- The following YAML configuration was used to produce this model:
-
- ```yaml
- merge_method: sce
- models:
- # Pivot model
- - model: Qwen/Qwen2.5-32B
- # Target models
- - model: qihoo360/Light-R1-32B-DS
- - model: qihoo360/TinyR1-32B-Preview
- - model: Gen-Verse/ReasonFlux-F1
- - model: Skywork/Skywork-OR1-32B-Preview
- - model: Qwen/Qwen2.5-32B-Instruct
- base_model: Qwen/Qwen2.5-32B
- parameters:
- select_topk: 1
- dtype: bfloat16
- tokenizer_source: qihoo360/Light-R1-32B-DS
- normalize: true
- int8_mask: true
  ```
 
+ ---
+ base_model:
+ - qihoo360/Light-R1-32B-DS
+ - Qwen/Qwen2.5-32B
+ - Gen-Verse/ReasonFlux-F1
+ - qihoo360/TinyR1-32B-Preview
+ - Skywork/Skywork-OR1-32B-Preview
+ - Qwen/Qwen2.5-32B-Instruct
+ library_name: transformers
+ tags:
+ - mergekit
+ - merge
+ license: apache-2.0
+ language:
+ - zho
+ - eng
+ - fra
+ - spa
+ - por
+ - deu
+ - ita
+ - rus
+ - jpn
+ - kor
+ - vie
+ - tha
+ - ara
+ pipeline_tag: text-generation
+ ---
+
+ *It has solved the problem of repeated generation in the previous generation.*
+
+ # merge
+
+ This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
+
+ ## Merge Details
+ ### Merge Method
+
+ This model was merged using the [SCE](https://arxiv.org/abs/2408.07990) merge method using [Qwen/Qwen2.5-32B](https://huggingface.co/Qwen/Qwen2.5-32B) as a base.
+
+ ### Models Merged
+
+ The following models were included in the merge:
+ * [qihoo360/Light-R1-32B-DS](https://huggingface.co/qihoo360/Light-R1-32B-DS)
+ * [Gen-Verse/ReasonFlux-F1](https://huggingface.co/Gen-Verse/ReasonFlux-F1)
+ * [qihoo360/TinyR1-32B-Preview](https://huggingface.co/qihoo360/TinyR1-32B-Preview)
+ * [Skywork/Skywork-OR1-32B-Preview](https://huggingface.co/Skywork/Skywork-OR1-32B-Preview)
+ * [Qwen/Qwen2.5-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-32B-Instruct)
+
+ ### Configuration
+
+ The following YAML configuration was used to produce this model:
+
+ ```yaml
+ merge_method: sce
+ models:
+ # Pivot model
+ - model: Qwen/Qwen2.5-32B
+ # Target models
+ - model: qihoo360/Light-R1-32B-DS
+ - model: qihoo360/TinyR1-32B-Preview
+ - model: Gen-Verse/ReasonFlux-F1
+ - model: Skywork/Skywork-OR1-32B-Preview
+ - model: Qwen/Qwen2.5-32B-Instruct
+ base_model: Qwen/Qwen2.5-32B
+ parameters:
+ select_topk: 1
+ dtype: bfloat16
+ tokenizer_source: qihoo360/Light-R1-32B-DS
+ normalize: true
+ int8_mask: true
  ```
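
Since this PR rewrites the whole front matter, a reviewer may want to confirm that the `base_model` list still matches the models named in the merge configuration. The sketch below is one way to check that, under the assumption that the model names are copied by hand from the README in the diff above. The `check_merge_metadata` helper is hypothetical and does not use mergekit or any Hub API.

```python
# Model names copied by hand from the README in the diff above.
FRONT_MATTER_BASE_MODELS = [
    "qihoo360/Light-R1-32B-DS",
    "Qwen/Qwen2.5-32B",
    "Gen-Verse/ReasonFlux-F1",
    "qihoo360/TinyR1-32B-Preview",
    "Skywork/Skywork-OR1-32B-Preview",
    "Qwen/Qwen2.5-32B-Instruct",
]
MERGE_CONFIG_MODELS = [
    "Qwen/Qwen2.5-32B",  # pivot model
    "qihoo360/Light-R1-32B-DS",
    "qihoo360/TinyR1-32B-Preview",
    "Gen-Verse/ReasonFlux-F1",
    "Skywork/Skywork-OR1-32B-Preview",
    "Qwen/Qwen2.5-32B-Instruct",
]
PIVOT_MODEL = "Qwen/Qwen2.5-32B"  # the config's base_model value


def check_merge_metadata(front_matter_models, config_models, pivot):
    """Return a list of problems; an empty list means the two lists agree."""
    problems = []
    missing = sorted(set(config_models) - set(front_matter_models))
    extra = sorted(set(front_matter_models) - set(config_models))
    if missing:
        problems.append(f"missing from front matter: {', '.join(missing)}")
    if extra:
        problems.append(f"not in merge config: {', '.join(extra)}")
    if pivot not in config_models:
        problems.append(f"pivot model {pivot} is not listed under models")
    return problems


problems = check_merge_metadata(
    FRONT_MATTER_BASE_MODELS, MERGE_CONFIG_MODELS, PIVOT_MODEL
)
print(problems or "front matter and merge config agree")
```

The comparison uses sets because the two lists order the models differently in the README; only membership matters for this check.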