Kimi-K2 Open-Source Model Tool Call Output Format Anomaly: Non-Standard tool_call_id Triggers Parsing Failures Compared with Official Model
When deploying the Kimi-K2 model with SGLang, the model's tool-call output becomes unstable after multiple rounds of tool invocation. For example:
{
  "id": "82874411f6fe4051ba2aa5a5fcf22075",
  "object": "chat.completion",
  "created": 1754875124,
  "model": "Kimi-K2-Instruct",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Now let me create the API testing framework. First, I'll create the project structure:<|tool_calls_section_begin|><|tool_call_begin|>call_59adf5614cfe4f4b8a71be54<|tool_call_argument_begin|>{"file_path": "C:\\Users\\api-testing-framework\\requirements.txt", "content": "requests>=2.28.0\npytest>=7.0.0\npytest-html>=4.0.0\npytest-xdist>=3.0.0\npyyaml>=6.0\njsonschema>=4.0.0\nfaker>=15.0.0\npython-dotenv>=1.0.0\njinja2>=3.1.0"}<|tool_call_end|><|tool_calls_section_end|>",
        "reasoning_content": null,
        "tool_calls": []
      },
      "logprobs": null,
      "finish_reason": "tool_calls",
      "matched_stop": null
    }
  ],
  "usage": {
    "prompt_tokens": 24639,
    "total_tokens": 24782,
    "completion_tokens": 143,
    "prompt_tokens_details": null
  }
}
In the above output, call_59adf5614cfe4f4b8a71be54 appears as the tool_call_id instead of following the standard functions.{func_name}:{index} format, causing parsing failures. Compared with the official Kimi-K2-0711-preview model, the open-source model performs significantly worse on tool calling, with the official model demonstrating notably higher accuracy.
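For reference, a correctly formatted tool call section would look like the following (the function name write_file, the index, and the arguments are illustrative, not taken from the actual run):

<|tool_calls_section_begin|><|tool_call_begin|>functions.write_file:1<|tool_call_argument_begin|>{"file_path": "requirements.txt", "content": "..."}<|tool_call_end|><|tool_calls_section_end|>

One workaround is to adjust the tool-call loop in chat_template.jinja: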
{%- for tool_call in message['tool_calls'] -%}
<|tool_call_begin|>functions.{{ tool_call['function']['name'] }}:{{ loop.index }}<|tool_call_argument_begin|>{% if tool_call['function']['arguments'] is string %}{{ tool_call['function']['arguments'] }}{% else %}{{ tool_call['function']['arguments'] | tojson }}{% endif %}<|tool_call_end|>
{%- endfor -%}
Modifying the chat_template.jinja file as shown above resolves the issue.
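For anyone trying this, the modified template can be supplied at launch time. Assuming your SGLang version's --chat-template flag accepts a Jinja file path, the invocation would be roughly:

python -m sglang.launch_server --model-path moonshotai/Kimi-K2-Instruct --chat-template ./chat_template.jinja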
@liopen
I believe the model does return the correct tool_id, but SGLang fails to capture it. That said, manually constructing the tool_id in the chat template could be a workaround. Just remember that the tool_id index must be global across the entire conversation: loop through every message and every tool call within each message. We should consider applying this fix to prevent similar issues.
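For example, a conversation-level counter can be kept in a Jinja namespace. A minimal sketch, assuming the surrounding message loop looks roughly like the official template (the 1-based start is an assumption and should be checked against the official template):

{#- Sketch only: a namespace counter persists across messages,
    unlike loop.index, which resets per message. -#}
{%- set ns = namespace(tool_idx=1) -%}
{%- for message in messages -%}
{%- if message['role'] == 'assistant' and message['tool_calls'] -%}
<|tool_calls_section_begin|>
{%- for tool_call in message['tool_calls'] -%}
<|tool_call_begin|>functions.{{ tool_call['function']['name'] }}:{{ ns.tool_idx }}<|tool_call_argument_begin|>{% if tool_call['function']['arguments'] is string %}{{ tool_call['function']['arguments'] }}{% else %}{{ tool_call['function']['arguments'] | tojson }}{% endif %}<|tool_call_end|>
{%- set ns.tool_idx = ns.tool_idx + 1 -%}
{%- endfor -%}
<|tool_calls_section_end|>
{%- endif -%}
{%- endfor -%}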
@bigeagle
Hi @bigmoyan and @bigeagle, I also encountered this issue in multi-turn tool calling with a long tool context (~25 tool descriptions) using SGLang.
The main issue is that K2 fails to return the tool name (or function name) it wants to call, as shown in the first comment from @liopen.
"content": "Now let me create the API testing framework. First, I'll create the project structure:<|tool_calls_section_begin|><|tool_call_begin|>call_59adf5614cfe4f4b8a71be54<|tool_call_argument_begin|>{"file_path": "C:\\Users\\api-testing-framework\\requirements.txt", "content": "requests>=2.28.0\npytest>=7.0.0\npytest-html>=4.0.0\npytest-xdist>=3.0.0\npyyaml>=6.0\njsonschema>=4.0.0\nfaker>=15.0.0\npython-dotenv>=1.0.0\njinja2>=3.1.0"}<|tool_call_end|><|tool_calls_section_end|>"
You can see that between <|tool_call_begin|> and <|tool_call_argument_begin|>, K2 outputs the tool call id (call_59adf5614cfe4f4b8a71be54), whereas the tool name is expected there in the format functions.{func_name}:{index}. From this response, the client cannot know which tool K2 wants to use. Moreover, this tool call id is fabricated; it does not reference any previous tool call id.
SGLang actually uses a regex to parse the output, and it does not recognize the tool call pattern when a tool call id appears between <|tool_call_begin|> and <|tool_call_argument_begin|>. As a result, the <|tool_calls_section_begin|> tokens end up in the final output text instead of being parsed into tool_calls.
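For illustration, a parser expecting the documented format would be matching something of roughly this shape (illustrative only, not SGLang's actual source):

<\|tool_call_begin\|>\s*functions\.(?P<name>[\w\.]+):(?P<index>\d+)\s*<\|tool_call_argument_begin\|>

A bare id like call_59adf5614cfe4f4b8a71be54 never matches the functions.{name}:{index} part, so the whole section falls through into the content field as plain text.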
I'm trying the modified chat_template.jinja file to see if this mitigates the issue.
@AdvancedMage
Big thanks for the PR, but using loop.index as the counter is incorrect: it is a message-level counter, while the correct one is conversation-level.
We are reaching out to the SGLang team to see if we can solve this issue.