---
license: mit
base_model:
- PRIME-RL/Eurus-2-7B-PRIME
- Qwen/Qwen2.5-7B-Instruct
tags:
- merge
- mergekit
- lazymergekit
language:
- zho
- eng
- fra
- spa
- por
- deu
- ita
- rus
- jpn
- kor
- vie
- tha
- ara
---

# Qwerus-7B

Qwerus-7B is a merge of the following models using [LazyMergekit](https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb?usp=sharing):
* [PRIME-RL/Eurus-2-7B-PRIME](https://huggingface.co/PRIME-RL/Eurus-2-7B-PRIME)
* [Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct)

Benchmark results on reasoning tasks, measured with [lighteval](https://github.com/huggingface/lighteval):

| Task     | Version | Metric           |  Value |   Stderr |
|----------|--------:|------------------|-------:|---------:|
| aime24   |       1 | extractive_match | 0.1333 | ± 0.0631 |
| math_500 |       1 | extractive_match | 0.7420 | ± 0.0196 |

In comparison, Qwen2.5-7B-Instruct alone scores higher on both tasks:

| Task     | Version | Metric           |  Value |   Stderr |
|----------|--------:|------------------|-------:|---------:|
| aime24   |       1 | extractive_match | 0.1667 | ± 0.0692 |
| math_500 |       1 | extractive_match | 0.8220 | ± 0.0171 |
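
For reference, a sketch of how such numbers can be produced with lighteval's accelerate backend, in the same notebook style as the usage example below. The task identifiers and argument syntax here are assumptions: both vary between lighteval releases (newer versions expect `model_name=` rather than `pretrained=` in the model arguments), so verify them against the version you install.

```python
!pip install -qU lighteval

# Assumed task strings in lighteval's "suite|task|few_shot|truncate" format;
# check the exact names against your installed version's task list.
!lighteval accelerate \
    "pretrained=mlabonne/Qwerus-7B" \
    "lighteval|aime24|0|0,lighteval|math_500|0|0"
```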

## 🧩 Configuration
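
This merge uses mergekit's `dare_ties` method. For each fine-tuned model, DARE keeps a random `density` fraction of its delta from the base model and rescales the survivors by `1/density`; TIES then elects a majority sign per parameter and sums only the agreeing deltas, weighted by `weight`, back into `Qwen/Qwen2.5-7B`. A toy single-tensor illustration of that combination (not mergekit's implementation):

```python
import torch

def dare_ties_toy(base, finetuned, densities, weights, seed=0):
    """Toy DARE-TIES merge of one weight tensor; illustration only."""
    torch.manual_seed(seed)
    deltas = []
    for ft, density, weight in zip(finetuned, densities, weights):
        delta = ft - base                               # task vector vs. the base model
        mask = torch.rand_like(delta) < density         # DARE: random drop ...
        deltas.append(weight * mask * delta / density)  # ... with 1/density rescale
    stacked = torch.stack(deltas)
    majority_sign = stacked.sum(dim=0).sign()           # TIES: elect a sign per parameter
    agree = stacked.sign() == majority_sign             # drop deltas that conflict with it
    return base + (stacked * agree).sum(dim=0)
```

The exact configuration used for this merge: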

```yaml
models:
  - model: Qwen/Qwen2.5-7B
    # No parameters necessary for base model
  - model: PRIME-RL/Eurus-2-7B-PRIME
    parameters:
      density: 0.56
      weight: 0.5
  - model: Qwen/Qwen2.5-7B-Instruct
    parameters:
      density: 0.56
      weight: 0.5
merge_method: dare_ties
base_model: Qwen/Qwen2.5-7B
dtype: bfloat16
```
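
To run the merge yourself, the usual route is mergekit's `mergekit-yaml` CLI. Below is a minimal sketch using mergekit's Python API instead, assuming the YAML above is saved as `config.yaml`; the option names follow mergekit's documented `run_merge` example and may shift between versions.

```python
# pip install mergekit
import yaml

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

# Parse the merge recipe from the YAML file shown above.
with open("config.yaml", encoding="utf-8") as f:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(f))

run_merge(
    merge_config,
    out_path="./Qwerus-7B",      # local output directory (hypothetical path)
    options=MergeOptions(
        cuda=True,               # set to False to merge on CPU
        copy_tokenizer=True,     # copy the base model's tokenizer into the output
    ),
)
```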

## 💻 Usage
66
+
67
+ ```python
68
+ !pip install -qU transformers accelerate
69
+
70
+ from transformers import AutoTokenizer
71
+ import transformers
72
+ import torch
73
+
74
+ model = "mlabonne/Qwerus-7B"
75
+ messages = [{"role": "user", "content": "What is a large language model?"}]
76
+
77
+ tokenizer = AutoTokenizer.from_pretrained(model)
78
+ prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
79
+ pipeline = transformers.pipeline(
80
+ "text-generation",
81
+ model=model,
82
+ torch_dtype=torch.float16,
83
+ device_map="auto",
84
+ )
85
+
86
+ outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
87
+ print(outputs[0]["generated_text"])
88
  ```
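
Note that with the pipeline's default settings, `generated_text` contains the prompt followed by the completion; pass `return_full_text=False` in the pipeline call if you only want the newly generated tokens.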