Gibberish response generated when the input is around 3500 tokens

#2
by chenxiangyi10 - opened

I load the model with load_in_8bit=True.

Only gibberish is generated when the input token sequence length is around 3500.
