---
license: llama3.1
base_model:
- taide/Llama-3.1-TAIDE-LX-8B-Chat
pipeline_tag: text2text-generation
---

## Uses

Note: the model's output quality is poor.

```
import openvino_genai as ov_genai
from openvino_genai import GenerationConfig
import huggingface_hub as hf_hub

# Download the INT4 OpenVINO model files into a local "ov" directory.
hf_hub.snapshot_download("hsuwill000/Llama-3.1-TAIDE-LX-8B-Chat_int4_ov", local_dir="ov")

pipe = ov_genai.LLMPipeline("ov", "CPU")
tokenizer = pipe.get_tokenizer()
tokenizer.set_chat_template(tokenizer.chat_template)

config = GenerationConfig(
    max_new_tokens=4096,
    stop_strings={"<|eot_id|>"},  # stop_strings must be a set; "<|end_header_id|>" could also be added
)

output_buffer = ""

def streamer(subword):
    # Accumulate streamed subwords and echo them as they arrive.
    global output_buffer
    output_buffer += subword
    print(subword, end='', flush=True)

pipe.start_chat()
while True:
    try:
        question = input('question:\n')
        # Manually build the prompt (imitating the LLaMA 3 instruct style).
        prompt = "<|user|>\n" + question + "\n<|eot_id|>"
    except EOFError:
        break
    output_buffer = ""  # clear the accumulated buffer
    pipe.generate(prompt, config, streamer=streamer)
    print('\n----------\n')
pipe.finish_chat()
```
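
For a quick sanity check without the interactive chat loop, here is a minimal one-shot sketch. It assumes the `ov` directory produced by the `snapshot_download` call above; the example prompt is arbitrary, and `max_new_tokens` passed as a keyword updates the pipeline's generation config directly.

```
import openvino_genai as ov_genai

# Assumes the "ov" directory downloaded above; prints a single completion.
pipe = ov_genai.LLMPipeline("ov", "CPU")
print(pipe.generate("The weather today is", max_new_tokens=64))
```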