<thought> was not being invoked when typing Korean.

#1
by JDNOH - opened

English input works, but when I input a query in Korean, the <thought> reasoning is missing from the response. Does the EXAONE-Deep model only support English?

LG AI Research org

Hello, @JDNOH! Thank you for asking.

By default, EXAONE-Deep uses the generation prompt [|assistant|]<thought>\n to ensure the model produces reasoning steps during generation.
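
For reference, you can render the prompt as text to confirm that the template appends it. A minimal sketch (the hub id LGAI-EXAONE/EXAONE-Deep-7.8B is assumed here; substitute your local path):

from transformers import AutoTokenizer

# Assumed checkpoint id; use your local path if you downloaded the model.
tokenizer = AutoTokenizer.from_pretrained("LGAI-EXAONE/EXAONE-Deep-7.8B")
prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Hello"}],
    tokenize=False,
    add_generation_prompt=True,
)
print(prompt)  # the rendered prompt should end with "[|assistant|]<thought>\n"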

To help us better understand the issue, could you share the input and the output (if too long, just part of it) from your test case?

!!!!!!!!!!!!!!!!!!!!! INPUT !!!!!!!!!!!!!!!!!!!!!
import time
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TextIteratorStreamer
from threading import Thread

model_name = "/downloads/EXAONE-Deep-7.8B"
streaming = True    # choose the streaming option

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

messages = [
    # Korean for: "How many golf balls can fit in a school bus?"
    {"role": "user", "content": "학ꡐ λ²„μŠ€μ—λŠ” 골프곡이 λͺ‡ κ°œλ‚˜ λ“€μ–΄κ°ˆ 수 μžˆλ‚˜μš”?"}
]

input_ids = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt"
)

start_time = time.time()
if streaming:
    streamer = TextIteratorStreamer(tokenizer)
    thread = Thread(target=model.generate, kwargs=dict(
        input_ids=input_ids.to("cuda"),
        eos_token_id=tokenizer.eos_token_id,
        max_new_tokens=32768,
        do_sample=True,
        temperature=0.6,
        top_p=0.95,
        streamer=streamer
    ))
    thread.start()

    for text in streamer:
        print(text, end="", flush=True)

!!!!!!!!!!!!!!!!!!!!! OUTPUT !!!!!!!!!!!!!!!!!!!!!
[|system|][|endofturn|]
[|user|]학ꡐ λ²„μŠ€μ—λŠ” 골프곡이 λͺ‡ κ°œλ‚˜ λ“€μ–΄κ°ˆ 수 μžˆλ‚˜μš”?
[|assistant|]<thought>

</thought>

학ꡐ λ²„μŠ€μ— 골프곡이 λͺ‡ κ°œλ‚˜ λ“€μ–΄κ°ˆ 수 μžˆλŠ”μ§€μ— λŒ€ν•œ μ •ν™•ν•œ μˆ«μžλŠ” λ²„μŠ€μ˜ 크기와 ν˜•νƒœμ— 따라 λ‹€λ₯΄κΈ° λ•Œλ¬Έμ— λ‹¨μ •μ μœΌλ‘œ λ§μ”€λ“œλ¦¬κΈ° μ–΄λ ΅μŠ΅λ‹ˆλ‹€. 일반적으둜 λ²„μŠ€μ˜ λ‚΄λΆ€ 곡간을 계산할 λ•ŒλŠ” 승객용 μ’Œμ„ 곡간과 짐을 μ‹£λŠ” 곡간을 κ³ λ €ν•΄μ•Ό ν•©λ‹ˆλ‹€. 골프곡은 비ꡐ적 큰 물건이기 λ•Œλ¬Έμ—, λ²„μŠ€μ˜ μ§μΉΈμ΄λ‚˜ νŠΉμ • 곡간에 따라 μ΅œλŒ€ 10~20개 정도가 λ“€μ–΄κ°ˆ 수 μžˆμ„ κ²ƒμœΌλ‘œ μΆ”μ •λ©λ‹ˆλ‹€. ν•˜μ§€λ§Œ μ΄λŠ” λ²„μŠ€μ˜ 크기와 κ³¨ν”„κ³΅μ˜ 크기에 따라 λ‹¬λΌμ§ˆ 수 μžˆμœΌλ‹ˆ, ꡬ체적인 λ²„μŠ€ λͺ¨λΈμ΄λ‚˜ κ³¨ν”„κ³΅μ˜ 크기λ₯Ό μ•Œκ³  μžˆλ‹€λ©΄ 더 μ •ν™•ν•œ 수치λ₯Ό 얻을 수 μžˆμ„ κ²ƒμž…λ‹ˆλ‹€.[|endofturn|]
(English translation: "It is difficult to state definitively how many golf balls can fit in a school bus, because the exact number depends on the bus's size and shape. In general, when calculating a bus's interior space you have to consider the passenger seating area and the luggage space. Since a golf ball is a relatively large object, it is estimated that at most about 10 to 20 could fit, depending on the luggage compartment or the specific space. However, this can vary with the size of the bus and the golf balls, so if you know the specific bus model or the golf ball size, you could get a more accurate figure.")
LG AI Research org

Thank you for sharing the example.

I noticed that in your output the tokenizer does not seem to generate <thought> after [|assistant|]. This might be related to the chat template configuration in your tokenizer_config.json file (the chat_template field near the bottom of the file).
Here's the valid chat template for EXAONE-Deep models:

{% for message in messages %}{% if loop.first and message['role'] != 'system' %}{{ '[|system|][|endofturn|]\\n' }}{% endif %}{% set content = message['content'] %}{% if '</thought>' in content %}{% set content = content.split('</thought>')[-1].lstrip('\\n') %}{% endif %}{{ '[|' + message['role'] + '|]' + content }}{% if not message['role'] == 'user' %}{{ '[|endofturn|]' }}{% endif %}{% if not loop.last %}{{ '\\n' }}{% endif %}{% endfor %}{% if add_generation_prompt %}{{ '\\n[|assistant|]<thought>\\n' }}{% endif %}

As you can see at the end of the template, the generation prompt \n[|assistant|]<thought>\n is included to ensure proper reasoning steps.
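
If the chat_template field in your local copy has been modified, one quick way to test (a hedged sketch, not an official fix) is to override the template at runtime before applying it; EXAONE_CHAT_TEMPLATE below is a placeholder for the Jinja string quoted above:

# Sketch: override the chat template in code rather than editing the file.
# EXAONE_CHAT_TEMPLATE is a placeholder; paste the template string above here.
EXAONE_CHAT_TEMPLATE = "..."
tokenizer.chat_template = EXAONE_CHAT_TEMPLATE
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)  # should now end with "[|assistant|]<thought>\n"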

LG AI Research org

I hadn't noticed that you had edited the output showing the missing <thought>.

I'll review your example and get back to you once we have more information.

LG AI Research org

Thank you for waiting, @JDNOH.

As described in the Usage Guidelines, when we evaluated CSAT Math 2025, each question was followed by the instruction "Please reason step by step, and put your final answer within \boxed{}." in the prompt, and we observed that the EXAONE Deep model demonstrated the expected performance. We therefore recommend adding this instruction to your prompt. However, the optimal prompt may vary depending on the question type.
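
For example, the instruction can simply be appended to the user turn. A minimal sketch using the question from this thread (note the backslash in \boxed{} is doubled for a Python string literal):

# Sketch: append the recommended reasoning instruction to the user prompt.
messages = [
    {"role": "user", "content": (
        "학ꡐ λ²„μŠ€μ—λŠ” 골프곡이 λͺ‡ κ°œλ‚˜ λ“€μ–΄κ°ˆ 수 μžˆλ‚˜μš”? "
        "Please reason step by step, and put your final answer within \\boxed{}."
    )}
]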

We would be happy if you found a better prompt for your use cases and shared it with the community.
