lbourdois committed
Commit cfe370d · verified · 1 Parent(s): 0e02dc3

Improve language tag

Hi! As the model is multilingual, this PR adds languages other than English to the language tag to improve discoverability. Note that the README announces 29 languages, but only 13 are explicitly listed, so I was only able to add those 13.

Files changed (1)
  1. README.md +65 -52
README.md CHANGED
@@ -1,52 +1,65 @@
- ---
- base_model:
- - Qwen/Qwen2.5-14B-Instruct
- - Triangle104/DS-R1-Distill-Q2.5-14B-Harmony_V0.1
- - Qwen/Qwen2.5-14B-Instruct-1M
- - suayptalha/Lamarckvergence-14B
- library_name: transformers
- tags:
- - mergekit
- - merge
-
- ---
- # merge
-
- This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
-
- ## Merge Details
- ### Merge Method
-
- This model was merged using the [SCE](https://arxiv.org/abs/2408.07990) merge method using [Qwen/Qwen2.5-14B-Instruct](https://huggingface.co/Qwen/Qwen2.5-14B-Instruct) as a base.
-
- ### Models Merged
-
- The following models were included in the merge:
- * [Triangle104/DS-R1-Distill-Q2.5-14B-Harmony_V0.1](https://huggingface.co/Triangle104/DS-R1-Distill-Q2.5-14B-Harmony_V0.1)
- * [Qwen/Qwen2.5-14B-Instruct-1M](https://huggingface.co/Qwen/Qwen2.5-14B-Instruct-1M)
- * [suayptalha/Lamarckvergence-14B](https://huggingface.co/suayptalha/Lamarckvergence-14B)
-
- ### Configuration
-
- The following YAML configuration was used to produce this model:
-
- ```yaml
- merge_method: sce
- base_model: Qwen/Qwen2.5-14B-Instruct
- models:
-   - model: Qwen/Qwen2.5-14B-Instruct-1M
-     parameters:
-       weight: 0.23
-   - model: Triangle104/DS-R1-Distill-Q2.5-14B-Harmony_V0.1
-     parameters:
-       weight: 0.39
-   - model: suayptalha/Lamarckvergence-14B
-     parameters:
-       weight: 0.38
- parameters:
-   density: 0.4
-   select_topk: 0.2
-   normalize: true
- dtype: bfloat16
- tokenizer_source: Triangle104/DS-R1-Distill-Q2.5-14B-Harmony_V0.1
- ```
+ ---
+ base_model:
+ - Qwen/Qwen2.5-14B-Instruct
+ - Triangle104/DS-R1-Distill-Q2.5-14B-Harmony_V0.1
+ - Qwen/Qwen2.5-14B-Instruct-1M
+ - suayptalha/Lamarckvergence-14B
+ library_name: transformers
+ tags:
+ - mergekit
+ - merge
+ language:
+ - zho
+ - eng
+ - fra
+ - spa
+ - por
+ - deu
+ - ita
+ - rus
+ - jpn
+ - kor
+ - vie
+ - tha
+ - ara
+ ---
+ # merge
+
+ This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
+
+ ## Merge Details
+ ### Merge Method
+
+ This model was merged using the [SCE](https://arxiv.org/abs/2408.07990) merge method using [Qwen/Qwen2.5-14B-Instruct](https://huggingface.co/Qwen/Qwen2.5-14B-Instruct) as a base.
+
+ ### Models Merged
+
+ The following models were included in the merge:
+ * [Triangle104/DS-R1-Distill-Q2.5-14B-Harmony_V0.1](https://huggingface.co/Triangle104/DS-R1-Distill-Q2.5-14B-Harmony_V0.1)
+ * [Qwen/Qwen2.5-14B-Instruct-1M](https://huggingface.co/Qwen/Qwen2.5-14B-Instruct-1M)
+ * [suayptalha/Lamarckvergence-14B](https://huggingface.co/suayptalha/Lamarckvergence-14B)
+
+ ### Configuration
+
+ The following YAML configuration was used to produce this model:
+
+ ```yaml
+ merge_method: sce
+ base_model: Qwen/Qwen2.5-14B-Instruct
+ models:
+   - model: Qwen/Qwen2.5-14B-Instruct-1M
+     parameters:
+       weight: 0.23
+   - model: Triangle104/DS-R1-Distill-Q2.5-14B-Harmony_V0.1
+     parameters:
+       weight: 0.39
+   - model: suayptalha/Lamarckvergence-14B
+     parameters:
+       weight: 0.38
+ parameters:
+   density: 0.4
+   select_topk: 0.2
+   normalize: true
+ dtype: bfloat16
+ tokenizer_source: Triangle104/DS-R1-Distill-Q2.5-14B-Harmony_V0.1
+ ```
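For reviewers who want to sanity-check the card, here is a minimal sketch of how the SCE configuration above could be re-run. It assumes mergekit's documented Python entry points (`MergeConfiguration`, `MergeOptions`, `run_merge`) and that the YAML block has been saved locally as `config.yaml`; the `mergekit-yaml config.yaml ./merged` CLI is the simpler equivalent. This is not part of the PR, only a reference sketch.

```python
# Hypothetical reproduction sketch, assuming mergekit's documented Python API
# (MergeConfiguration, MergeOptions, run_merge) and a local config.yaml holding
# the SCE recipe shown in the card.
import yaml
import torch

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

CONFIG_YML = "config.yaml"  # the SCE configuration from the card
OUTPUT_DIR = "./merged"     # where the merged checkpoint will be written

# Parse and validate the merge recipe.
with open(CONFIG_YML, "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

# Run the SCE merge: this downloads the base model and the three donor models,
# then writes the bfloat16 result and the tokenizer pinned by tokenizer_source.
run_merge(
    merge_config,
    out_path=OUTPUT_DIR,
    options=MergeOptions(cuda=torch.cuda.is_available()),
)
```

The checkpoint written to `./merged` can then be loaded like any other `transformers` causal LM.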