CUDA error when using with HuggingFaceEmbedding from LlamaIndex
First, thanks for sharing the model. I've been trying to use this model as the embedding engine with LlamaIndex, but it crashed with the CUDA error "CUBLAS_STATUS_EXECUTION_FAILED", probably a memory-access error.
The important thing is that I fixed it by setting max_length=512, so it occurred to me that it might be an error in one of the config files, since a similar parameter is 514 in config.json and 512 in tokenizer_config.json.
I'm new to this field so I can't confirm it, but I'm posting this in case anyone else is having the same issue.
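For reference, here is a minimal sketch of the workaround in LlamaIndex. The model id below is a placeholder for whichever checkpoint you are using, and the import path assumes a recent LlamaIndex release:

```python
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

# Cap inputs at 512 tokens so position indices stay within the model's
# usable range; over-long inputs can surface as CUBLAS_STATUS_EXECUTION_FAILED.
embed_model = HuggingFaceEmbedding(
    model_name="xlm-roberta-base",  # placeholder: substitute the actual model id
    max_length=512,
)

embedding = embed_model.get_text_embedding("Hello, world!")
print(len(embedding))  # dimensionality of the embedding vector
```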
Yeah, you should specify max_length=512 when running the tokenizer, otherwise it will cause indexing errors in the model's forward pass.
About the reason why this model (based on xlm-roberta) has 514 position embeddings, please see the discussion at https://github.com/facebookresearch/fairseq/issues/1187
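To make the mismatch concrete, here is a small sketch using the plain transformers tokenizer (xlm-roberta-base stands in for any XLM-RoBERTa-based checkpoint). RoBERTa-style models reserve two position slots for the padding offset, which is why config.json reports 514 while only 512 token positions are usable:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")

long_text = "word " * 5000  # deliberately longer than the model can handle

# Without truncation the tokenizer emits more than 512 ids, and position
# indices then run past the 514-slot embedding table
# (512 usable positions + 2 reserved for the padding offset).
inputs = tokenizer(long_text, truncation=True, max_length=512, return_tensors="pt")
print(inputs["input_ids"].shape)  # torch.Size([1, 512])
```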
OK, thanks for the clarification!