Using llama.cpp for GGUF conversion.

Original model: HIT-TMG/KaLM-embedding-multilingual-mini-instruct-v1.5

Run the GGUF files directly with llama.cpp:

```bash
./llama-embedding \
  --batch-size 512 \
  --ctx-size 512 \
  -m KaLM-embedding-multilingual-mini-instruct-v1.5-GGUF/model.f32.gguf \
  --pooling mean \
  -p "this is a test sentence for llama cpp"
```

Note that this model uses mean pooling, so the `--pooling` parameter must be set to `mean` when invoking llama.cpp.
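For intuition, mean pooling averages the per-token hidden states (excluding padding) into a single sentence vector. The toy sketch below illustrates the computation on made-up numbers; the shapes, values, and mask are hypothetical and not taken from this model.

```python
import numpy as np

# Toy example: 4 tokens, hidden size 3; the last token is padding.
hidden_states = np.array([
    [0.2, 0.1, 0.4],
    [0.0, 0.3, 0.1],
    [0.4, 0.2, 0.1],
    [9.9, 9.9, 9.9],  # padding: must be excluded from the average
])
attention_mask = np.array([1, 1, 1, 0])

# Mean pooling: sum the unmasked token vectors, divide by the token count.
masked = hidden_states * attention_mask[:, None]
sentence_embedding = masked.sum(axis=0) / attention_mask.sum()
print(sentence_embedding)  # -> [0.2 0.2 0.2]
```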

Our tests in LM Studio have not yet been successful; it is unclear whether this is related to the pooling method LM Studio uses by default.

If any developers know how to specify the pooling method for embedding models in LM Studio, please contact us for further discussion at [email protected]
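In the meantime, one possible workaround is to serve the model with llama.cpp's llama-server, which lets you fix the pooling method on the command line and exposes an OpenAI-compatible embeddings endpoint. The sketch below assumes a recent llama.cpp build with the `--embeddings` and `--pooling` flags; flag names may differ across versions, and we have not verified this setup against this exact model.

```python
# Sketch: query a locally running llama-server for embeddings.
# Assumed server launch (recent llama.cpp; flag names may vary by version):
#   ./llama-server -m KaLM-embedding-multilingual-mini-instruct-v1.5-GGUF/model.f32.gguf \
#     --embeddings --pooling mean --port 8080
import requests

resp = requests.post(
    "http://localhost:8080/v1/embeddings",
    json={"input": "this is a test sentence for llama cpp"},
)
resp.raise_for_status()
embedding = resp.json()["data"][0]["embedding"]
print(len(embedding))  # embedding dimension
```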

Model size: 494M params. Architecture: qwen2.