Commit dc500f0
1 Parent(s): 0806c70
Update llama.cpp support
Files changed:
- README.md (+4 -7)
- chat_template.jinja (+146 -0)
README.md
CHANGED
@@ -20,6 +20,7 @@ library_name: transformers
 <img src="assets/EXAONE_Symbol+BI_3d.png", width="300", style="margin: 40 auto;">
 🎉 License Updated! We are pleased to announce our more flexible licensing terms 🤗
 <br>✈️ Try on <a href="https://friendli.ai/suite/~/serverless-endpoints/LGAI-EXAONE/EXAONE-4.0-32B/overview">FriendliAI</a>
+<br><br><i>📢 EXAONE 4.0 is officially supported by llama.cpp! Please check the guide <a href="#quickstart-gguf">below</a></i>
 <br>
 
 # EXAONE-4.0-1.2B-GGUF
@@ -53,11 +54,7 @@ For more details, please refer to our [technical report](https://arxiv.org/abs/2
 ### llama.cpp
 You can run EXAONE models locally using llama.cpp by following these steps:
 
-1. Install the latest version of llama.cpp
-
-```bash
-git clone --single-branch -b add-exaone4 https://github.com/lgai-exaone/llama.cpp.git
-```
+1. Install the latest version of llama.cpp (version >= `b5932`). Please check the official [installation guide](https://github.com/ggml-org/llama.cpp?tab=readme-ov-file#quick-start) from llama.cpp.
 
 2. Download the EXAONE 4.0 model weights in GGUF format.
 
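The download step in the hunk above does not pin a method; here is a minimal sketch using `huggingface_hub` (the Q4_K_M filename is an assumption; pick the actual quantization from the repository's file list):

```python
from huggingface_hub import hf_hub_download

# Assumed filename; substitute the quantization you actually want.
gguf_path = hf_hub_download(
    repo_id="LGAI-EXAONE/EXAONE-4.0-1.2B-GGUF",
    filename="EXAONE-4.0-1.2B-Q4_K_M.gguf",
)
print(gguf_path)  # local cache path to pass to llama.cpp via -m
```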
@@ -108,12 +105,12 @@ git clone --single-branch -b add-exaone4 https://github.com/lgai-exaone/llama.cp
 <details>
 <summary>OpenAI compatible server with `llama-server`</summary>
 
-3. Run llama-server with EXAONE 4.0 Jinja template.
+3. Run llama-server with EXAONE 4.0 Jinja template. You can find the [chat template file](https://huggingface.co/LGAI-EXAONE/EXAONE-4.0-1.2B-GGUF/blob/main/chat_template.jinja) in this repository.
 ```bash
 llama-server -m EXAONE-4.0-32B-Q4_K_M.gguf \
     -c 131072 -fa -ngl 64 \
     --temp 0.6 --top-p 0.95 \
-    --jinja --chat-template-format
+    --jinja --chat-template-file chat_template.jinja \
     --host 0.0.0.0 --port 8820 \
     -a EXAONE-4.0-32B-Q4_K_M
 ```
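Once the server from the hunk above is running, it exposes llama.cpp's OpenAI-compatible `/v1/chat/completions` route. A minimal client sketch, assuming the server is reachable on `localhost:8820` with the alias set by `-a`, and that `requests` is installed:

```python
import requests

resp = requests.post(
    "http://localhost:8820/v1/chat/completions",
    json={
        "model": "EXAONE-4.0-32B-Q4_K_M",  # must match the -a alias
        "messages": [{"role": "user", "content": "Explain what GGUF is in one sentence."}],
        "temperature": 0.6,
        "top_p": 0.95,
    },
    timeout=300,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```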
chat_template.jinja
ADDED
@@ -0,0 +1,146 @@
+{%- if not skip_think is defined %}
+    {%- set skip_think = true %}
+{%- endif %}
+
+{%- set role_indicators = {
+    'user': '[|user|]\n',
+    'assistant': '[|assistant|]\n',
+    'system': '[|system|]\n',
+    'tool': '[|tool|]\n'
+} %}
+{%- set end_of_turn = '[|endofturn|]\n' %}
+
+
+{%- macro available_tools(tools) %}
+    {{- "# Available Tools" }}
+    {{- "\nYou can use none, one, or multiple of the following tools by calling them as functions to help with the user’s query." }}
+    {{- "\nHere are the tools available to you in JSON format within <tool> and </tool> tags:\n" }}
+    {%- for tool in tools %}
+        {{- "<tool>" }}
+        {{- tool | tojson(ensure_ascii=False) | safe }}
+        {{- "</tool>\n" }}
+    {%- endfor %}
+
+    {{- "\nFor each function call you want to make, return a JSON object with function name and arguments within <tool_call> and </tool_call> tags, like:" }}
+    {{- "\n<tool_call>{\"name\": function_1_name, \"arguments\": {argument_1_name: argument_1_value, argument_2_name: argument_2_value}}</tool_call>" }}
+    {{- "\n<tool_call>{\"name\": function_2_name, \"arguments\": {...}}</tool_call>\n..." }}
+    {{- "\nNote that if no argument name is specified for a tool, you can just print the argument value directly, without the argument name or JSON formatting." }}
+{%- endmacro %}
+
+
+{%- set ns = namespace(last_query_index = messages|length - 1) %}
+{%- for message in messages %}
+    {%- if message.role == "user" and message.content is string %}
+        {%- set ns.last_query_index = loop.index0 -%}
+    {%- endif %}
+{%- endfor %}
+
+{%- for i in range(messages | length) %}
+    {%- set msg = messages[i] %}
+    {%- set role = msg.role %}
+    {%- if role not in role_indicators %}
+        {{- raise_exception('Unknown role: ' ~ role) }}
+    {%- endif %}
+
+    {%- if i == 0 %}
+        {%- if role == 'system' %}
+            {{- role_indicators['system'] }}
+            {{- msg.content }}
+            {%- if tools is defined and tools %}
+                {{- "\n\n" }}{{- available_tools(tools) }}
+            {%- endif %}
+            {{- end_of_turn -}}
+            {%- continue %}
+        {%- elif tools is defined and tools %}
+            {{- role_indicators['system'] }}
+            {{- available_tools(tools) }}
+            {{- end_of_turn -}}
+        {%- endif %}
+    {%- endif %}
+
+    {%- if role == 'assistant' %}
+        {{- role_indicators['assistant'] }}
+
+        {%- if msg.content %}
+            {%- if "</think>" in msg.content %}
+                {%- set content = msg.content.split('</think>')[-1].strip() %}
+                {%- set reasoning_content = msg.content.split('</think>')[0].strip() %}
+                {%- if reasoning_content.startswith("<think>") %}
+                    {%- set reasoning_content = reasoning_content[7:].strip() %}
+                {%- endif %}
+            {%- else %}
+                {%- set content = msg.content %}
+            {%- endif %}
+
+            {%- if msg.reasoning_content %}
+                {%- set reasoning_content = msg.reasoning_content %}
+            {%- endif %}
+
+            {%- if (not skip_think and i > ns.last_query_index) and reasoning_content is defined %}
+                {{- "<think>\n" }}
+                {{- reasoning_content }}
+                {{- "\n</think>\n\n" }}
+            {%- else %}
+                {{- "<think>\n\n</think>\n\n" }}
+            {%- endif %}
+            {{- content }}
+        {%- endif %}
+
+        {%- if msg.tool_calls %}
+            {%- if msg.content %}
+                {{- "\n" }}
+            {%- else %}
+                {{- "<think>\n\n</think>\n\n" }}
+            {%- endif %}
+            {%- for tool_call in msg.tool_calls %}
+                {%- if tool_call.function is defined %}
+                    {%- set tool_call = tool_call.function %}
+                {%- endif %}
+
+                {%- if tool_call.arguments is defined %}
+                    {%- set arguments = tool_call.arguments %}
+                {%- elif tool_call.parameters is defined %}
+                    {%- set arguments = tool_call.parameters %}
+                {%- else %}
+                    {{- raise_exception('arguments or parameters are mandatory: ' ~ tool_call) }}
+                {%- endif %}
+
+                {{- "<tool_call>" }}{"name": "{{- tool_call.name }}", "arguments": {{ arguments | tojson(ensure_ascii=False) | safe }}}{{- "</tool_call>" }}
+
+                {%- if not loop.last %}
+                    {{- "\n" }}
+                {%- endif %}
+
+            {%- endfor %}
+        {%- endif %}
+        {{- end_of_turn -}}
+
+    {%- elif role == "tool" %}
+        {%- if i == 0 or messages[i - 1].role != "tool" %}
+            {{- role_indicators['tool'] }}
+        {%- endif %}
+        {%- if msg.content is defined %}
+            {{- "<tool_result>" }}{"result": {{ msg.content | tojson(ensure_ascii=False) | safe }}}{{- "</tool_result>" }}
+        {%- endif %}
+        {%- if loop.last or messages[i + 1].role != "tool" %}
+            {{- end_of_turn -}}
+        {%- else %}
+            {{- "\n" }}
+        {%- endif %}
+
+    {%- else %}
+        {{- role_indicators[role] }}
+        {{- msg.content }}
+        {{- end_of_turn -}}
+    {%- endif %}
+{% endfor %}
+
+
+{%- if add_generation_prompt %}
+    {{- role_indicators['assistant'] }}
+    {%- if enable_thinking is defined and enable_thinking is true %}
+        {{- "<think>\n" }}
+    {%- else %}
+        {{- "<think>\n\n</think>\n\n" }}
+    {%- endif %}
+{%- endif %}
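To sanity-check the prompt this template produces, it can be rendered outside llama.cpp with plain `jinja2`. A minimal sketch, assuming the added file above is saved (without the leading `+` diff markers) as `chat_template.jinja`; the `tojson` and `raise_exception` helpers are stand-ins for the ones HF transformers normally registers (stock jinja2's `tojson` filter does not take `ensure_ascii`), and the `loopcontrols` extension is required because the template uses `{%- continue %}`:

```python
import json
from jinja2 import Environment, FileSystemLoader

env = Environment(
    loader=FileSystemLoader("."),
    extensions=["jinja2.ext.loopcontrols"],  # the template uses {%- continue %}
)
# Stand-in for the transformers-style tojson filter (supports ensure_ascii).
env.filters["tojson"] = lambda value, ensure_ascii=True: json.dumps(
    value, ensure_ascii=ensure_ascii
)

def raise_exception(message):
    raise ValueError(message)

env.globals["raise_exception"] = raise_exception

template = env.get_template("chat_template.jinja")
print(template.render(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
    add_generation_prompt=True,
    enable_thinking=False,  # non-reasoning mode: an empty <think> block is emitted
))
```

For tool calls, the template emits `<tool_call>{"name": ..., "arguments": {...}}</tool_call>` spans, so a client can recover them from a completion roughly as follows (the sample completion string is illustrative):

```python
import json
import re

sample = '<think>\n\n</think>\n\n<tool_call>{"name": "get_weather", "arguments": {"city": "Seoul"}}</tool_call>'
for payload in re.findall(r"<tool_call>(.*?)</tool_call>", sample, re.DOTALL):
    call = json.loads(payload)
    print(call["name"], call["arguments"])
```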