---
base_model:
- nvidia/AceMath-7B-Instruct
- jeffmeloy/Qwen2.5-7B-nerd-uncensored-v1.0
- jeffmeloy/Qwen2.5-7B-olm-v1.0
- Aashraf995/Qwen-Evo-7B
- Goekdeniz-Guelmez/Josiefied-Qwen2.5-7B-Instruct-abliterated-v2
- Qwen/Qwen2.5-7B-Instruct
- Krystalan/DRT-o1-7B
library_name: transformers
tags:
- mergekit
- merge
language:
- zho
- eng
- fra
- spa
- por
- deu
- ita
- rus
- jpn
- kor
- vie
- tha
- ara
---
# merge

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details
### Merge Method

This model was merged using the [SCE](https://arxiv.org/abs/2408.07990) merge method, with [Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct) as the base. In brief, SCE selects the parameter deltas with the highest variance across the source models (the fraction kept is controlled by `select_topk` in the configuration below), derives per-model fusion coefficients from them, and drops elements with conflicting signs before applying the result to the base model.

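To reproduce a merge like this, the usual route is mergekit's CLI (`mergekit-yaml config.yaml ./merged`). The snippet below is a minimal sketch of the equivalent Python call, assuming the entry points documented in the mergekit README (`MergeConfiguration`, `run_merge`, `MergeOptions`) and a local `config.yaml` holding the configuration from the section below.

```python
# Minimal sketch: run the merge from Python. Assumes the entry points
# documented in the mergekit README; config.yaml holds the configuration
# shown in the Configuration section below.
import yaml
import torch

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

with open("config.yaml", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

run_merge(
    merge_config,
    out_path="./merged",  # where the merged checkpoint is written
    options=MergeOptions(
        cuda=torch.cuda.is_available(),  # merge on GPU when available
        copy_tokenizer=True,             # keep the base model's tokenizer
    ),
)
```
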
### Models Merged

The following models were included in the merge:
* [nvidia/AceMath-7B-Instruct](https://huggingface.co/nvidia/AceMath-7B-Instruct)
* [jeffmeloy/Qwen2.5-7B-nerd-uncensored-v1.0](https://huggingface.co/jeffmeloy/Qwen2.5-7B-nerd-uncensored-v1.0)
* [jeffmeloy/Qwen2.5-7B-olm-v1.0](https://huggingface.co/jeffmeloy/Qwen2.5-7B-olm-v1.0)
* [Aashraf995/Qwen-Evo-7B](https://huggingface.co/Aashraf995/Qwen-Evo-7B)
* [Goekdeniz-Guelmez/Josiefied-Qwen2.5-7B-Instruct-abliterated-v2](https://huggingface.co/Goekdeniz-Guelmez/Josiefied-Qwen2.5-7B-Instruct-abliterated-v2)
* [Krystalan/DRT-o1-7B](https://huggingface.co/Krystalan/DRT-o1-7B)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: Goekdeniz-Guelmez/Josiefied-Qwen2.5-7B-Instruct-abliterated-v2 # Best for Benchmark 1 (emphasized)
    parameters:
      density: 0.2
      weight: 0.25 # Increased weight for more influence
  - model: Aashraf995/Qwen-Evo-7B # Best for Benchmark 2
    parameters:
      density: 0.15
      weight: 0.125
  - model: nvidia/AceMath-7B-Instruct # Best for Benchmark 3 (math focus)
    parameters:
      density: 0.2
      weight: 0.25 # Increased weight for better math performance
  - model: Krystalan/DRT-o1-7B # Best for Benchmark 4
    parameters:
      density: 0.15
      weight: 0.125
  - model: jeffmeloy/Qwen2.5-7B-nerd-uncensored-v1.0 # Best for Benchmark 5
    parameters:
      density: 0.15
      weight: 0.125
  - model: jeffmeloy/Qwen2.5-7B-olm-v1.0 # Best for Benchmark 6
    parameters:
      density: 0.15
      weight: 0.125

merge_method: sce
base_model: Qwen/Qwen2.5-7B-Instruct # Replace if using a different base model
parameters:
  normalize: false
  int8_mask: true
  select_topk: 0.314 # Retains the top ~31% of high-variance elements (math and other key areas)
dtype: bfloat16
allow_crimes: true
```
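
Once merged, a quick smoke test with transformers looks like the following; `./merged` is a placeholder for wherever the merge output was written (or the model's Hub id).

```python
# Quick smoke test of the merged checkpoint. "./merged" is a placeholder
# path; substitute the actual output directory or Hub repo id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "./merged"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the dtype used for the merge
    device_map="auto",
)

messages = [{"role": "user", "content": "What is 17 * 24?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```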