MetaphoricalCode committed on
Commit 032ef70 · verified · 1 Parent(s): bb9c780

Upload 15 files

.gitattributes CHANGED
@@ -33,3 +33,11 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ tokenizer.json filter=lfs diff=lfs merge=lfs -text
+ assets/ball.gif filter=lfs diff=lfs merge=lfs -text
+ assets/benchmark.png filter=lfs diff=lfs merge=lfs -text
+ assets/count.png filter=lfs diff=lfs merge=lfs -text
+ assets/diamond.png filter=lfs diff=lfs merge=lfs -text
+ assets/param-aime2024.jpeg filter=lfs diff=lfs merge=lfs -text
+ assets/param-lcb.jpeg filter=lfs diff=lfs merge=lfs -text
+ assets/writing.png filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,161 @@
+ ---
+ license: apache-2.0
+ pipeline_tag: text-generation
+ library_name: transformers
+ base_model:
+ - a-m-team/AM-Thinking-v1
+ base_model_relation: quantized
+ ---
+ ## Quantized using the default exllamav3 (0.0.4) quantization process
+
+ - Original model: https://huggingface.co/a-m-team/AM-Thinking-v1
+ - exllamav3: https://github.com/turboderp-org/exllamav3
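+
+ The exl3 settings used for this quant (4.0 bits per weight, 6-bit head, converter 0.0.4) are recorded under `quantization_config` in `config.json`. A minimal sketch for inspecting them, assuming transformers simply surfaces the extra block as a config attribute; the local path is hypothetical:
+
+ ```python
+ from transformers import AutoConfig
+
+ # Read the quantization metadata stored alongside the weights.
+ cfg = AutoConfig.from_pretrained("./AM-Thinking-v1-exl3")  # hypothetical local path
+ print(cfg.quantization_config)
+ # expected: {'quant_method': 'exl3', 'version': '0.0.4', 'bits': 4.0, 'head_bits': 6, ...}
+ ```
+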
+ ---
+ # AM‑Thinking‑v1: Advancing the Frontier of Reasoning at 32B Scale
+ * 2025-05-10 · a-m‑team
+
+ <p align="center">
+ 🤗 <a href="https://huggingface.co/a-m-team">Hugging Face</a>&nbsp;&nbsp; | &nbsp;&nbsp; 📑 <a href="https://arxiv.org/abs/2505.08311">Paper</a> &nbsp;&nbsp; | &nbsp;&nbsp; 📑 <a href="https://a-m-team.github.io/am-thinking-v1/">Blog</a>
+ </p>
+
+ ## 🚀 Introduction
+
+ We release **AM-Thinking‑v1**, a 32B dense language model focused on enhancing reasoning capabilities.
+ Built on Qwen 2.5‑32B‑Base, AM-Thinking‑v1 shows strong performance on reasoning benchmarks, comparable to much larger MoE models such as **DeepSeek‑R1**, **Qwen3‑235B‑A22B**, and **Seed1.5-Thinking**, and to larger dense models such as **Nemotron-Ultra-253B-v1**.
+
+ <div style="text-align: center;">
+ <img src="assets/benchmark.png" alt="benchmark" style="width: 90%;">
+ </div>
+
+ ## 🧩 Why Another 32B Reasoning Model Matters
+
+ Large Mixture‑of‑Experts (MoE) models such as **DeepSeek‑R1** or **Qwen3‑235B‑A22B** dominate leaderboards—but they also demand clusters of high‑end GPUs. Many teams just need *the best dense model that fits on a single card*.
+ **AM‑Thinking‑v1** fills that gap **while remaining fully based on open-source components**:
+
+ * **Outperforms DeepSeek‑R1** on AIME’24/’25 & LiveCodeBench and **approaches Qwen3‑235B‑A22B** despite having roughly 1/7 the parameter count.
+ * **Built on the publicly available Qwen 2.5‑32B‑Base**, with RL training queries that are likewise publicly available.
+ * Shows that with a **well‑designed post‑training pipeline** (SFT + dual‑stage RL) you can squeeze flagship‑level reasoning out of a 32B dense model.
+ * **Deploys on one A100‑80 GB** with deterministic latency—no MoE routing overhead.
+
+ <div style="text-align: center;">
+ <img src="assets/param-aime2024.jpeg" alt="AIME 2024" style="width: 90%; margin-bottom: 20px;">
+ <img src="assets/param-lcb.jpeg" alt="LiveCodeBench" style="width: 90%;">
+ <div style="margin-top: 10px;">
+ <em>AM-Thinking-v1 achieves strong reasoning performance with significantly fewer parameters.</em>
+ </div>
+ </div>
+
+ ## 🛠️ Use Cases
+
+ ### 1) Code Generation
+ <pre style="font-family: 'Times New Roman', serif; font-size: 12px; border: 1px solid black; padding: 10px; font-style: italic;">
+ PROMPT:
+ write a python script for a bouncing red ball within a triangle, make sure to handle collision detection properly. make the triangle slowly rotate. implement it in python. make sure ball stays within the triangle
+ </pre>
+ <div style="text-align: center;">
+ <img src="assets/ball.gif" alt="Bouncing Red Ball" width="50%">
+ </div>
+
+ ### 2) Logic
+
+ <div style="text-align: center;">
+ <img src="assets/diamond.png" alt="diamond" width="90%">
+ </div>
+
+ ### 3) Writing
+ <div style="text-align: center;">
+ <img src="assets/writing.png" alt="sushi" width="90%">
+ </div>
+
+ ## ⚡ Quick start
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model_name = "a-m-team/AM-Thinking-v1"
+
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+ model = AutoModelForCausalLM.from_pretrained(
+     model_name,
+     torch_dtype="auto",
+     device_map="auto"
+ )
+
+ prompt = "How can I find inner peace?"
+ messages = [
+     {"role": "user", "content": prompt}
+ ]
+ text = tokenizer.apply_chat_template(
+     messages,
+     tokenize=False,
+     add_generation_prompt=True
+ )
+ model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
+
+ generated_ids = model.generate(
+     **model_inputs,
+     max_new_tokens=49152
+ )
+ output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()
+
+ response = tokenizer.decode(output_ids, skip_special_tokens=True)
+ think_content = response.split("<think>")[1].split("</think>")[0]
+ answer_content = response.split("<answer>")[1].split("</answer>")[0]
+
+ print(f"user prompt: {prompt}")
+ print(f"model thinking: {think_content}")
+ print(f"model answer: {answer_content}")
+ ```
+ > Note: We have included the system prompt in the tokenizer configuration, as it was used during both the SFT and RL stages. To ensure consistent output quality, we recommend including the same system prompt during actual usage; otherwise, the model's responses may be significantly affected.
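+
+ The `split`-based extraction above assumes the response always contains both a `<think>…</think>` and an `<answer>…</answer>` pair; if generation is truncated (for example by `max_new_tokens`) it raises an `IndexError`. A more defensive sketch, continuing from the snippet above (not part of the original card):
+
+ ```python
+ import re
+ from typing import Optional
+
+ def extract_tag(response: str, tag: str) -> Optional[str]:
+     """Return the content of <tag>...</tag>, or None if the pair is missing."""
+     match = re.search(rf"<{tag}>(.*?)</{tag}>", response, re.DOTALL)
+     return match.group(1).strip() if match else None
+
+ think_content = extract_tag(response, "think")
+ answer_content = extract_tag(response, "answer") or response  # fall back to the raw text
+ ```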
+
+ ### Quantized versions for compact devices
+ A series of quantized versions of the AM-Thinking-v1 model, for use with [llama.cpp](https://github.com/ggml-org/llama.cpp) and [Ollama](https://github.com/ollama/ollama), is available at [AM-Thinking-v1-gguf](https://huggingface.co/a-m-team/AM-Thinking-v1-gguf).
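+
+ For example, a minimal sketch using the `llama-cpp-python` bindings (not mentioned in the original card), assuming one of the GGUF files has been downloaded locally; the filename and quant level below are hypothetical:
+
+ ```python
+ from llama_cpp import Llama
+
+ # Load a quantized GGUF build of the model (path and quant level are examples).
+ llm = Llama(model_path="./AM-Thinking-v1-Q4_K_M.gguf", n_ctx=8192)
+
+ out = llm.create_chat_completion(
+     messages=[{"role": "user", "content": "How can I find inner peace?"}],
+     temperature=0.6,  # sampling defaults taken from this repo's generation_config.json
+     top_p=0.95,
+ )
+ print(out["choices"][0]["message"]["content"])
+ ```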
+
+ ## 🔧 Post-training pipeline
+
+ To achieve its strong reasoning ability, AM‑Thinking‑v1 goes through a carefully designed post-training pipeline.
+ Below we describe the key stages involved in turning a base model into a high-performing reasoner:
+
+ **Step 1 – Cold‑start SFT.**
+ We begin with the open-sourced **Qwen 2.5‑32B‑Base** and run a broad supervised fine‑tune on a blended training dataset of math, code and open‑domain chat. This endows the model with a "think‑then‑answer" behavioural pattern and equips it with an initial capacity for reasoning.
+
+ **Step 2 – Pass‑rate‑aware data curation.**
+ Before any RL, the SFT model is evaluated on every math‑ and code‑oriented training query. For each item we log a pass rate; only those with **0 < pass‑rate < 1** are kept. In effect we discard problems the model already masters and those it utterly fails, concentrating learning on genuinely informative cases.
+
+ **Step 3 – Reinforcement learning.**
+ We adopt a two‑stage GRPO scheme: Stage 1 trains only on math and code queries. Once it converges, Stage 2 starts by removing every query the model answered 100% correctly in Stage 1 and adjusting key hyper‑parameters such as maximum generation length and learning rate.
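+
+ A toy illustration of the pass‑rate filtering described in Steps 2 and 3 (an illustrative sketch with made-up data, not the team's actual pipeline code):
+
+ ```python
+ # Pass rates measured by sampling the SFT model on each training query (toy data).
+ sft_pass_rate = {"easy_sum": 1.0, "hard_geometry": 0.4, "unsolved_proof": 0.0}
+
+ # Step 2: keep only genuinely informative items, i.e. 0 < pass rate < 1.
+ rl_pool = [q for q, p in sft_pass_rate.items() if 0.0 < p < 1.0]
+ print(rl_pool)  # ['hard_geometry']
+
+ # Step 3, Stage 2: additionally drop any query Stage 1 now solves every time.
+ stage1_pass_rate = {"hard_geometry": 0.9}
+ stage2_pool = [q for q in rl_pool if stage1_pass_rate.get(q, 0.0) < 1.0]
+ print(stage2_pool)  # ['hard_geometry']
+ ```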
+
+ ## ⚠️ Limitations
+
+ While AM‑Thinking‑v1 excels at pure language reasoning and open‑domain chat, it has not yet been trained for structured function‑calling or tool‑use workflows, which restricts its usefulness in agent‑style applications that must act on external systems.
+ Improving the model's ability to follow complex instructions is also an important direction for our future work.
+ In addition, our safety alignment is still at an early stage, so more rigorous red‑teaming is required to reduce potential harms.
+
+ ## 📚 Citation
+ The a-m-team is an internal team at Beike (Ke.com), dedicated to exploring AGI technology.
+ If you find our work helpful, feel free to cite us.
+
+ ```bibtex
+ @misc{ji2025amthinkingv1advancingfrontierreasoning,
+       title={AM-Thinking-v1: Advancing the Frontier of Reasoning at 32B Scale},
+       author={Yunjie Ji and Xiaoyu Tian and Sitong Zhao and Haotian Wang and Shuaiting Chen and Yiping Peng and Han Zhao and Xiangang Li},
+       year={2025},
+       eprint={2505.08311},
+       archivePrefix={arXiv},
+       primaryClass={cs.CL},
+       url={https://arxiv.org/abs/2505.08311},
+ }
+ ```
added_tokens.json ADDED
@@ -0,0 +1,24 @@
+ {
+   "</tool_call>": 151658,
+   "<tool_call>": 151657,
+   "<|box_end|>": 151649,
+   "<|box_start|>": 151648,
+   "<|endoftext|>": 151643,
+   "<|file_sep|>": 151664,
+   "<|fim_middle|>": 151660,
+   "<|fim_pad|>": 151662,
+   "<|fim_prefix|>": 151659,
+   "<|fim_suffix|>": 151661,
+   "<|im_end|>": 151645,
+   "<|im_start|>": 151644,
+   "<|image_pad|>": 151655,
+   "<|object_ref_end|>": 151647,
+   "<|object_ref_start|>": 151646,
+   "<|quad_end|>": 151651,
+   "<|quad_start|>": 151650,
+   "<|repo_name|>": 151663,
+   "<|video_pad|>": 151656,
+   "<|vision_end|>": 151653,
+   "<|vision_pad|>": 151654,
+   "<|vision_start|>": 151652
+ }
config.json ADDED
@@ -0,0 +1,39 @@
+ {
+   "architectures": [
+     "Qwen2ForCausalLM"
+   ],
+   "attention_dropout": 0.0,
+   "bos_token_id": 151643,
+   "eos_token_id": 151643,
+   "hidden_act": "silu",
+   "hidden_size": 5120,
+   "initializer_range": 0.02,
+   "intermediate_size": 27648,
+   "max_position_embeddings": 131072,
+   "max_window_layers": 64,
+   "model_type": "qwen2",
+   "num_attention_heads": 40,
+   "num_hidden_layers": 64,
+   "num_key_value_heads": 8,
+   "rms_norm_eps": 1e-05,
+   "rope_scaling": null,
+   "rope_theta": 1000000.0,
+   "sliding_window": null,
+   "tie_word_embeddings": false,
+   "torch_dtype": "bfloat16",
+   "transformers_version": "4.46.0",
+   "use_cache": false,
+   "use_sliding_window": false,
+   "vocab_size": 152064,
+   "quantization_config": {
+     "quant_method": "exl3",
+     "version": "0.0.4",
+     "bits": 4.0,
+     "head_bits": 6,
+     "calibration": {
+       "rows": 100,
+       "cols": 2048
+     },
+     "out_scales": "auto"
+   }
+ }
generation_config.json ADDED
@@ -0,0 +1,11 @@
+ {
+   "bos_token_id": 151643,
+   "pad_token_id": 151643,
+   "eos_token_id": [
+     151645,
+     151643
+   ],
+   "temperature": 0.6,
+   "top_p": 0.95,
+   "repetition_penalty": 1.0
+ }
merges.txt ADDED
The diff for this file is too large to render. See raw diff
 
model-00001-of-00003.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:94db585972f95b6a03d186fece2c9b6b5c9f720f952db68a7197f6ef1f574ab6
+ size 8391760800
model-00002-of-00003.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:048d2bb43d10d89a8558206ec36216df1520aaa19e15f1500783d069d6c08b66
+ size 8543277872
model-00003-of-00003.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1046d9465889a03c71b9904c227b8d900add1d3762d13e76df91ba780faed523
+ size 828344304
model.safetensors.index.json ADDED
The diff for this file is too large to render. See raw diff
 
quantization_config.json ADDED
The diff for this file is too large to render. See raw diff
 
special_tokens_map.json ADDED
@@ -0,0 +1,31 @@
+ {
+   "additional_special_tokens": [
+     "<|im_start|>",
+     "<|im_end|>",
+     "<|object_ref_start|>",
+     "<|object_ref_end|>",
+     "<|box_start|>",
+     "<|box_end|>",
+     "<|quad_start|>",
+     "<|quad_end|>",
+     "<|vision_start|>",
+     "<|vision_end|>",
+     "<|vision_pad|>",
+     "<|image_pad|>",
+     "<|video_pad|>"
+   ],
+   "eos_token": {
+     "content": "<|im_end|>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "<|endoftext|>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9c5ae00e602b8860cbd784ba82a8aa14e8feecec692e7076590d014d7b7fdafa
+ size 11421896
tokenizer_config.json ADDED
@@ -0,0 +1,208 @@
+ {
+   "add_bos_token": false,
+   "add_prefix_space": false,
+   "added_tokens_decoder": {
+     "151643": {
+       "content": "<|endoftext|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151644": {
+       "content": "<|im_start|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151645": {
+       "content": "<|im_end|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151646": {
+       "content": "<|object_ref_start|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151647": {
+       "content": "<|object_ref_end|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151648": {
+       "content": "<|box_start|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151649": {
+       "content": "<|box_end|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151650": {
+       "content": "<|quad_start|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151651": {
+       "content": "<|quad_end|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151652": {
+       "content": "<|vision_start|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151653": {
+       "content": "<|vision_end|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151654": {
+       "content": "<|vision_pad|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151655": {
+       "content": "<|image_pad|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151656": {
+       "content": "<|video_pad|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151657": {
+       "content": "<tool_call>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "151658": {
+       "content": "</tool_call>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "151659": {
+       "content": "<|fim_prefix|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "151660": {
+       "content": "<|fim_middle|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "151661": {
+       "content": "<|fim_suffix|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "151662": {
+       "content": "<|fim_pad|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "151663": {
+       "content": "<|repo_name|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     },
+     "151664": {
+       "content": "<|file_sep|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": false
+     }
+   },
+   "additional_special_tokens": [
+     "<|im_start|>",
+     "<|im_end|>",
+     "<|object_ref_start|>",
+     "<|object_ref_end|>",
+     "<|box_start|>",
+     "<|box_end|>",
+     "<|quad_start|>",
+     "<|quad_end|>",
+     "<|vision_start|>",
+     "<|vision_end|>",
+     "<|vision_pad|>",
+     "<|image_pad|>",
+     "<|video_pad|>"
+   ],
+   "bos_token": null,
+   "chat_template": "{%- if tools %}\n {{- '<|im_start|>system\\n' }}\n {%- if messages[0]['role'] == 'system' %}\n {{- messages[0]['content'] }}\n {%- else %}\n {{- 'You are a helpful assistant. To answer the user\\'s question, you first think about the reasoning process and then provide the user with the answer. The reasoning process and answer are enclosed within <think> </think> and <answer> </answer> tags, respectively, i.e., <think> reasoning process here </think> <answer> answer here </answer>.' }}\n {%- endif %}\n {{- \"\\n\\n# Tools\\n\\nYou may call one or more functions to assist with the user query.\\n\\nYou are provided with function signatures within <tools></tools> XML tags:\\n<tools>\" }}\n {%- for tool in tools %}\n {{- \"\\n\" }}\n {{- tool | tojson }}\n {%- endfor %}\n {{- \"\\n</tools>\\n\\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\\n<tool_call>\\n{\\\"name\\\": <function-name>, \\\"arguments\\\": <args-json-object>}\\n</tool_call><|im_end|>\\n\" }}\n{%- else %}\n {%- if messages[0]['role'] == 'system' %}\n {{- '<|im_start|>system\\n' + messages[0]['content'] + '<|im_end|>\\n' }}\n {%- else %}\n {{- '<|im_start|>system\\nYou are a helpful assistant. To answer the user\\'s question, you first think about the reasoning process and then provide the user with the answer. The reasoning process and answer are enclosed within <think> </think> and <answer> </answer> tags, respectively, i.e., <think> reasoning process here </think> <answer> answer here </answer>.<|im_end|>\\n' }}\n {%- endif %}\n{%- endif %}\n{%- for message in messages %}\n {%- if (message.role == \"user\") or (message.role == \"system\" and not loop.first) or (message.role == \"assistant\" and not message.tool_calls) %}\n {{- '<|im_start|>' + message.role + '\\n' + message.content + '<|im_end|>' + '\\n' }}\n {%- elif message.role == \"assistant\" %}\n {{- '<|im_start|>' + message.role }}\n {%- if message.content %}\n {{- '\\n' + message.content }}\n {%- endif %}\n {%- for tool_call in message.tool_calls %}\n {%- if tool_call.function is defined %}\n {%- set tool_call = tool_call.function %}\n {%- endif %}\n {{- '\\n<tool_call>\\n{\"name\": \"' }}\n {{- tool_call.name }}\n {{- '\", \"arguments\": ' }}\n {{- tool_call.arguments | tojson }}\n {{- '}\\n</tool_call>' }}\n {%- endfor %}\n {{- '<|im_end|>\\n' }}\n {%- elif message.role == \"tool\" %}\n {%- if (loop.index0 == 0) or (messages[loop.index0 - 1].role != \"tool\") %}\n {{- '<|im_start|>user' }}\n {%- endif %}\n {{- '\\n<tool_response>\\n' }}\n {{- message.content }}\n {{- '\\n</tool_response>' }}\n {%- if loop.last or (messages[loop.index0 + 1].role != \"tool\") %}\n {{- '<|im_end|>\\n' }}\n {%- endif %}\n {%- endif %}\n{%- endfor %}\n{%- if add_generation_prompt %}\n {{- '<|im_start|>assistant\\n' }}\n{%- endif %}\n",
+   "clean_up_tokenization_spaces": false,
+   "eos_token": "<|im_end|>",
+   "errors": "replace",
+   "model_max_length": 131072,
+   "pad_token": "<|endoftext|>",
+   "padding_side": "right",
+   "split_special_tokens": false,
+   "tokenizer_class": "Qwen2Tokenizer",
+   "unk_token": null
+ }
vocab.json ADDED
The diff for this file is too large to render. See raw diff