Ollama version doesn't properly truncate tokens to 512 max

#14
by shuaiscott - opened

When using the official Ollama model of snowflake-arctic-embed-l (latest/335m - 21ab8b9b0545), if the input is greater than 512 tokens, instead of truncating, the model appears to fail silently and returns all-zero embeddings ([0,0,0...]).

I've checked my Ollama parameters and this occurs when "truncate": true. Other embedding models properly truncate the input, and I see an INFO log line in Ollama saying "input truncated". I don't see this message with snowflake-arctic-embed-l.

When "truncate" is set to false, I get the expected "input length exceeds maximum context length" error instead.
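In case it helps, here's a minimal sketch of how I'm reproducing this. It assumes a local Ollama server on the default port; the `/api/embed` endpoint and the `model`/`input`/`truncate` fields are from Ollama's public API, but the exact over-length input that triggers the bug will vary by tokenizer:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/embed"  # default local Ollama endpoint


def embed(text: str, truncate: bool = True) -> list[float]:
    """Request one embedding from a local Ollama server via /api/embed.

    The "truncate" field controls whether over-length input is cut down
    to the model's context window instead of raising an error.
    """
    payload = json.dumps({
        "model": "snowflake-arctic-embed-l",
        "input": text,
        "truncate": truncate,
    }).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        # Response holds a list of embeddings, one per input string.
        return json.load(resp)["embeddings"][0]


def is_all_zero(vec: list[float]) -> bool:
    """Detect the failure mode described above: an embedding of all zeros."""
    return len(vec) > 0 and all(v == 0.0 for v in vec)
```

With a long input (e.g. `embed("word " * 1000)`, well over 512 tokens), `is_all_zero` on the result comes back true for this model, while other embedding models return a normal truncated embedding.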

Also just leaving a thanks for building these embedding models!

I'm not super familiar with truncation in Ollama -- the Ollama version of this model is provided by the Ollama community, not Snowflake. You may want to raise this on the Ollama GitHub issue tracker.
