Error when encoding large sentences
#9
opened by GuillaumeGrosjean
Encoding large texts throws an error. When executing the following code:
```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("dangvantuan/sentence-camembert-large")
model.encode(":" * 1000)
```
It throws the error:

```
IndexError: index out of range in self
```
It seems that `model.max_seq_length` defaults to 514. When explicitly setting `model.max_seq_length = 512`, everything works fine.

Large texts seem to be truncated to 514 tokens by default, but I think they should be truncated to 512. This is likely because RoBERTa-style models such as CamemBERT report `max_position_embeddings=514` while reserving two of those positions for offset/padding, so only 512 positions are actually usable; any longer input makes the position-embedding lookup fail with the `IndexError` above.
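For reference, a minimal sketch of the workaround, assuming that capping `max_seq_length` before encoding is the only change needed:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("dangvantuan/sentence-camembert-large")

# Work around the off-by-two default: cap inputs at the 512 positions
# the underlying CamemBERT model can actually address.
model.max_seq_length = 512

# The long input is now truncated to 512 tokens instead of raising IndexError.
embedding = model.encode(":" * 1000)
print(embedding.shape)  # embedding vector for the truncated input
```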