|
Here's an example of preparing the input for `model.generate()`, using the `Zephyr` assistant model:
|
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "HuggingFaceH4/zephyr-7b-beta"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)  # You may want to use bfloat16 and/or move to GPU here

messages = [
    {
        "role": "system",
        "content": "You are a friendly chatbot who always responds in the style of a pirate",
    },
    {"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
]
tokenized_chat = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt")
print(tokenizer.decode(tokenized_chat[0]))
```
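As the comment above notes, you may want to load the model in `bfloat16` and move it to a GPU for faster generation. A minimal sketch of one way to do that, assuming a CUDA device is available and a `transformers` version that accepts `torch_dtype`:

```python
import torch

# A sketch: load the weights in bfloat16 and move the model to the GPU (assumes CUDA is available)
model = AutoModelForCausalLM.from_pretrained(checkpoint, torch_dtype=torch.bfloat16).to("cuda")
# The inputs must live on the same device as the model before calling generate()
tokenized_chat = tokenized_chat.to("cuda")
```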
|
Decoding `tokenized_chat` will yield a string in the input format that Zephyr expects:

```text
<|system|>
You are a friendly chatbot who always responds in the style of a pirate
<|user|>
How many helicopters can a human eat in one sitting?
<|assistant|>
```
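If you only want the formatted prompt as a string rather than token IDs, you can also pass `tokenize=False` to `apply_chat_template`. A minimal sketch, using the same `messages` as above:

```python
# A sketch: return the formatted chat as a plain string instead of token IDs
chat_string = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(chat_string)
```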
|
|
|
Now that our input is formatted correctly for Zephyr, we can use the model to generate a response to the user's question: |
|
```python
outputs = model.generate(tokenized_chat, max_new_tokens=128)
print(tokenizer.decode(outputs[0]))
```
|
This will yield: |
|
```text
<|system|>
You are a friendly chatbot who always responds in the style of a pirate</s>
<|user|>
How many helicopters can a human eat in one sitting?</s>
<|assistant|>
Matey, I'm afraid I must inform ye that humans cannot eat helicopters.
```
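Note that the decoded output includes the full prompt as well as the model's reply. If you only want the newly generated text, one option is to slice off the prompt tokens before decoding. A minimal sketch, assuming `tokenized_chat` and `outputs` from the examples above:

```python
# A sketch: keep only the tokens generated after the prompt, then decode them
prompt_length = tokenized_chat.shape[1]
response = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)
print(response)
```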