LG-AI-EXAONE committed
Commit dc500f0 · 1 Parent(s): 0806c70

Update llama.cpp support

Files changed (2):
  1. README.md +4 -7
  2. chat_template.jinja +146 -0
README.md CHANGED
@@ -20,6 +20,7 @@ library_name: transformers
  <img src="assets/EXAONE_Symbol+BI_3d.png", width="300", style="margin: 40 auto;">
  🎉 License Updated! We are pleased to announce our more flexible licensing terms 🤗
  <br>✈️ Try on <a href="https://friendli.ai/suite/~/serverless-endpoints/LGAI-EXAONE/EXAONE-4.0-32B/overview">FriendliAI</a>
+ <br><br><i>📢 EXAONE 4.0 is officially supported by llama.cpp! Please check the guide <a href="#quickstart-gguf">below</a></i>
  <br>

  # EXAONE-4.0-1.2B-GGUF
@@ -53,11 +54,7 @@ For more details, please refer to our [technical report](https://arxiv.org/abs/2
  ### llama.cpp
  You can run EXAONE models locally using llama.cpp by following these steps:

- 1. Install the latest version of llama.cpp by cloning our PR and building from source. Please refer to the official documentation about [building from source](https://github.com/ggml-org/llama.cpp/blob/master/docs/build.md).
-
- ```bash
- git clone --single-branch -b add-exaone4 https://github.com/lgai-exaone/llama.cpp.git
- ```
+ 1. Install the latest version of llama.cpp (version >= `b5932`). Please check the official [installation guide](https://github.com/ggml-org/llama.cpp?tab=readme-ov-file#quick-start) from llama.cpp.

  2. Download the EXAONE 4.0 model weights in GGUF format.

@@ -108,12 +105,12 @@
  <details>
  <summary>OpenAI compatible server with `llama-server`</summary>

- 3. Run llama-server with EXAONE 4.0 Jinja template.
+ 3. Run llama-server with EXAONE 4.0 Jinja template. You can find the [chat template file](https://huggingface.co/LGAI-EXAONE/EXAONE-4.0-1.2B-GGUF/blob/main/chat_template.jinja) in this repository.
  ```bash
  llama-server -m EXAONE-4.0-32B-Q4_K_M.gguf \
  -c 131072 -fa -ngl 64 \
  --temp 0.6 --top-p 0.95 \
- --jinja --chat-template-format chat_template_simple.jinja \
+ --jinja --chat-template-format chat_template.jinja \
  --host 0.0.0.0 --port 8820 \
  -a EXAONE-4.0-32B-Q4_K_M
  ```
  </details>
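With the `llama-server` flags above, the model is served behind an OpenAI-compatible HTTP API. As a sketch of a client request (the port `8820` and the model alias follow the README command above; the `/v1/chat/completions` path is the standard OpenAI-compatible route served by `llama-server`, and the sampling values mirror the server flags):

```python
import json

# Build an OpenAI-style chat completions request body for the server
# started in step 3. Adjust host, port, and alias to your own setup.
payload = {
    "model": "EXAONE-4.0-32B-Q4_K_M",  # matches the -a alias given to llama-server
    "messages": [
        {"role": "system", "content": "You are EXAONE, a helpful assistant."},
        {"role": "user", "content": "Summarize the EXAONE 4.0 chat format."},
    ],
    "temperature": 0.6,  # same sampling settings as the --temp / --top-p flags
    "top_p": 0.95,
}

body = json.dumps(payload)

# To actually send it (requires the server to be running), e.g.:
#   curl http://localhost:8820/v1/chat/completions \
#     -H "Content-Type: application/json" \
#     -d '<body>'
```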
chat_template.jinja ADDED
@@ -0,0 +1,146 @@
+ {%- if not skip_think is defined %}
+ {%- set skip_think = true %}
+ {%- endif %}
+
+ {%- set role_indicators = {
+ 'user': '[|user|]\n',
+ 'assistant': '[|assistant|]\n',
+ 'system': '[|system|]\n',
+ 'tool': '[|tool|]\n'
+ } %}
+ {%- set end_of_turn = '[|endofturn|]\n' %}
+
+
+ {%- macro available_tools(tools) %}
+ {{- "# Available Tools" }}
+ {{- "\nYou can use none, one, or multiple of the following tools by calling them as functions to help with the user’s query." }}
+ {{- "\nHere are the tools available to you in JSON format within <tool> and </tool> tags:\n" }}
+ {%- for tool in tools %}
+ {{- "<tool>" }}
+ {{- tool | tojson(ensure_ascii=False) | safe }}
+ {{- "</tool>\n" }}
+ {%- endfor %}
+
+ {{- "\nFor each function call you want to make, return a JSON object with function name and arguments within <tool_call> and </tool_call> tags, like:" }}
+ {{- "\n<tool_call>{\"name\": function_1_name, \"arguments\": {argument_1_name: argument_1_value, argument_2_name: argument_2_value}}</tool_call>" }}
+ {{- "\n<tool_call>{\"name\": function_2_name, \"arguments\": {...}}</tool_call>\n..." }}
+ {{- "\nNote that if no argument name is specified for a tool, you can just print the argument value directly, without the argument name or JSON formatting." }}
+ {%- endmacro %}
+
+
+ {%- set ns = namespace(last_query_index = messages|length - 1) %}
+ {%- for message in messages %}
+ {%- if message.role == "user" and message.content is string %}
+ {%- set ns.last_query_index = loop.index0 -%}
+ {%- endif %}
+ {%- endfor %}
+
+ {%- for i in range(messages | length) %}
+ {%- set msg = messages[i] %}
+ {%- set role = msg.role %}
+ {% if role is not none and role.class is not none and role not in role_indicators %}
+ {{- raise_exception('Unknown role: ' ~ role) }}
+ {%- endif %}
+
+ {%- if i == 0 %}
+ {%- if role == 'system' %}
+ {{- role_indicators['system'] }}
+ {{- msg.content }}
+ {%- if tools is defined and tools %}
+ {{- "\n\n" }}{{- available_tools(tools) }}
+ {%- endif %}
+ {{- end_of_turn -}}
+ {%- continue %}
+ {%- elif tools is defined and tools %}
+ {{- role_indicators['system'] }}
+ {{- available_tools(tools) }}
+ {{- end_of_turn -}}
+ {%- endif %}
+ {%- endif %}
+
+ {%- if role == 'assistant' %}
+ {{- role_indicators['assistant'] }}
+
+ {%- if msg.content %}
+ {%- if "</think>" in msg.content %}
+ {%- set content = msg.content.split('</think>')[-1].strip() %}
+ {%- set reasoning_content = msg.content.split('</think>')[0].strip() %}
+ {%- if reasoning_content.startswith("<think>") %}
+ {%- set reasoning_content = reasoning_content[9:].strip() %}
+ {%- endif %}
+ {%- else %}
+ {%- set content = msg.content %}
+ {%- endif %}
+
+ {%- if msg.reasoning_content %}
+ {%- set reasoning_content = msg.reasoning_content %}
+ {%- endif %}
+
+ {%- if (not skip_think and loop.last) and reasoning_content is defined %}
+ {{- "<think>\n" }}
+ {{- reasoning_content}}
+ {{- "\n</think>\n\n" }}
+ {%- else %}
+ {{- "<think>\n\n</think>\n\n" }}
+ {%- endif %}
+ {{- content }}
+ {%- endif %}
+
+ {%- if msg.tool_calls %}
+ {%- if msg.content %}
+ {{- "\n" }}
+ {%- else %}
+ {{- "<think>\n\n</think>\n\n" }}
+ {%- endif %}
+ {%- for tool_call in msg.tool_calls %}
+ {%- if tool_call.function is defined %}
+ {%- set tool_call = tool_call.function %}
+ {%- endif %}
+
+ {%- if tool_call.arguments is defined %}
+ {%- set arguments = tool_call.arguments %}
+ {%- elif tool_call.parameters is defined %}
+ {%- set arguments = tool_call.parameters %}
+ {%- else %}
+ {{- raise_exception('arguments or parameters are mandatory: ' ~ tool_call) }}
+ {%- endif %}
+
+ {{- "<tool_call>" }}{"name": "{{- tool_call.name }}", "arguments": {{ arguments | tojson(ensure_ascii=False) | safe }}}{{- "</tool_call>" }}
+
+ {%- if not loop.last %}
+ {{- "\n" }}
+ {%- endif %}
+
+ {%- endfor %}
+ {%- endif %}
+ {{- end_of_turn -}}
+
+ {%- elif role == "tool" %}
+ {%- if i == 0 or messages[i - 1].role != "tool" %}
+ {{- role_indicators['tool'] }}
+ {%- endif %}
+ {%- if msg.content is defined %}
+ {{- "<tool_result>" }}{"result": {{ msg.content | tojson(ensure_ascii=False) | safe }}}{{- "</tool_result>" }}
+ {%- endif %}
+ {%- if loop.last or messages[i + 1].role != "tool" %}
+ {{- end_of_turn -}}
+ {%- else %}
+ {{- "\n" }}
+ {%- endif %}
+
+ {%- else %}
+ {{- role_indicators[role] }}
+ {{- msg.content }}
+ {{- end_of_turn -}}
+ {%- endif %}
+ {% endfor %}
+
+
+ {%- if add_generation_prompt %}
+ {{- role_indicators['assistant'] }}
+ {%- if enable_thinking is defined and enable_thinking is true %}
+ {{- "<think>\n" }}
+ {%- else %}
+ {{- "<think>\n\n</think>\n\n" }}
+ {%- endif %}
+ {%- endif %}
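
The template above renders conversations into EXAONE 4.0's role-tag format (`[|system|]`, `[|user|]`, `[|assistant|]`, `[|endofturn|]`), emitting an empty `<think>` block when thinking is disabled. A minimal sketch of that rendering logic in plain Python (a simplified re-implementation for illustration only; it covers system/user turns and the generation prompt, not the tool-calling or reasoning-content branches):

```python
# Simplified sketch of the chat template's non-thinking path.
ROLE_INDICATORS = {
    "user": "[|user|]\n",
    "assistant": "[|assistant|]\n",
    "system": "[|system|]\n",
    "tool": "[|tool|]\n",
}
END_OF_TURN = "[|endofturn|]\n"

def render_prompt(messages, add_generation_prompt=True, enable_thinking=False):
    out = []
    for msg in messages:
        # Each turn: role indicator, then content, then the end-of-turn tag.
        out.append(ROLE_INDICATORS[msg["role"]])
        out.append(msg["content"])
        out.append(END_OF_TURN)
    if add_generation_prompt:
        out.append(ROLE_INDICATORS["assistant"])
        # Open think block when thinking is enabled; empty one otherwise.
        out.append("<think>\n" if enable_thinking else "<think>\n\n</think>\n\n")
    return "".join(out)

prompt = render_prompt([
    {"role": "system", "content": "You are EXAONE."},
    {"role": "user", "content": "Hello!"},
])
print(prompt)
```

For the full behavior (tool descriptions, `<tool_call>`/`<tool_result>` blocks, and reasoning-content handling), use the actual `chat_template.jinja` file via `--jinja` as shown in the README.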