Update chat_template.jinja #9, opened by weege007
Don't merge; this is just a PR so the revision can be downloaded with huggingface-cli.
But this method seems unstable, and sometimes it still outputs tags.
Maybe filter the tags like this:
```python
# Inside the streaming generate method; `streamer`, `generated_text`,
# `times`, and `start` come from the surrounding code.
is_output_think = self.args.lm_gen_think_output
is_thinking = False
is_answer = True
think_text = ""
for new_text in streamer:
    times.append(perf_counter() - start)
    if "<think>" in new_text:
        yield "思考中,请稍等。"  # "Thinking, please wait."
        is_thinking = True
        think_text = ""
        think_text += new_text
        continue
    if "</think>" in new_text:
        is_thinking = False
        think_text += new_text
        logging.info(f"{think_text=}")
        think_text = ""
        new_text = new_text.replace("</think>", "")
    if is_thinking is True:
        think_text += new_text
        if is_output_think is True:
            generated_text += new_text
            yield new_text
        else:
            yield None
        continue
    if "<answer>" in new_text:
        is_answer = True
        new_text = new_text.replace("<answer>", "")
    if "</answer>" in new_text:
        is_answer = False
        continue
    if is_answer is True:
        generated_text += new_text
        yield new_text
    start = perf_counter()
yield "."  # end the sentence for downstream sentence processing, e.g. TTS
logging.info(f"{generated_text=} TTFT: {times[0]:.4f}s total time: {sum(times):.4f}s")
torch.cuda.empty_cache()
self._chat_history.append(
    {"role": "assistant", "content": [{"type": "text", "text": generated_text}]}
)
```
Filter tags like `<answer>`, `</answer>`, `<think>`, `</think>`.
Alright, I won't merge. It seems this approach is filtering the span tags in the model stream; actually, these tags could instead be handled specially in the frontend rendering.
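A minimal sketch of that frontend idea: stream the raw text, tags included, and let the renderer split it afterwards with a regex (hypothetical post-processing helper, not part of this repo):

```python
import re

def split_render(raw: str) -> dict:
    """Split raw model output into a collapsible think part and the visible answer."""
    think = "".join(re.findall(r"<think>(.*?)</think>", raw, flags=re.S))
    # Remove think spans, then strip the remaining answer tags.
    visible = re.sub(r"<think>.*?</think>", "", raw, flags=re.S)
    visible = visible.replace("<answer>", "").replace("</answer>", "")
    return {"think": think, "answer": visible.strip()}
```

This only works on complete output (or complete sentences), so it trades streaming granularity for simplicity compared to filtering inside the generation loop.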