Infill example broken? #2
by muelletm
Running the example from the README, I get this:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Salesforce/codegen2-1B")
model = AutoModelForCausalLM.from_pretrained("Salesforce/codegen2-1B", trust_remote_code=True, revision="main")

# Build the infill prompt: mark the span to fill with <mask_1>, then ask
# for it again after the <sep> token.
def format(prefix, suffix):
    return prefix + "<mask_1>" + suffix + "<|endoftext|>" + "<sep>" + "<mask_1>"

prefix = "def hello_world():\n    "
suffix = "    return name"
text = format(prefix, suffix)

input_ids = tokenizer(text, return_tensors="pt").input_ids
generated_ids = model.generate(input_ids, max_length=128)
print(tokenizer.decode(generated_ids[0], skip_special_tokens=False)[len(text):])
```
The print statement outputs:

```
def hello_world<eom><|endoftext|><|python|>#
```
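For reference, here is how I am pulling the infilled span out of the generation; a minimal sketch assuming the infill is everything the model emits after the prompt, up to the `<eom>` sentinel (the `extract_infill` helper is mine, not from the README):

```python
# Minimal sketch, assuming the model appends the infill after the prompt
# and terminates it with "<eom>". extract_infill is my own helper, not
# part of the CodeGen2 README.
def extract_infill(decoded: str, prompt: str) -> str:
    completion = decoded[len(prompt):]
    # Keep only the text up to the end-of-mask sentinel, if present.
    return completion.split("<eom>")[0]

decoded = tokenizer.decode(generated_ids[0], skip_special_tokens=False)
print(repr(extract_infill(decoded, text)))  # here: 'def hello_world'
```

Even with that post-processing, the extracted infill is just `def hello_world`, which does not define the `name` that the suffix `return name` needs.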
I also tested with the 7B model; the output was something like `return "Hello World"`, which is also not ideal given that the following line is `return name`.
I am wondering whether these models are simply not good at infilling, or whether there is a problem with the prompt construction.
Here is a Colab to reproduce:
https://colab.research.google.com/drive/1UZquOlGviRlV5xByenbs-A1GGAi_YTbs
Cheers!