Unable to use these with llama.cpp

#1
by circadesu - opened

```
{'error': {'message': 'Llama model must be created with embedding=True to call this method', 'type': 'internal_server_error', 'param': None, 'code': None}}
```

First off, thank you for uploading these! Is this intended? I don't seem to be able to use these GGUFs with llama.cpp.

I've tried the Q_4 and Q_8 variants. Thanks.

Nomic AI org

It seems like you are using the llama-cpp-python server. This model will not be supported by it until they update their llama.cpp dependency: the version of llama.cpp they currently use does not read the pooling type from this model, and the server has no argument to specify it manually. If you would like this model to be supported by llama-cpp-python, please open an issue there. (At a minimum you will need --embedding true, but that alone will not give you the expected results.)
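
For illustration, here is a rough sketch of the direct llama-cpp-python API under those constraints; the model path is a placeholder:

```python
# Rough sketch of the llama-cpp-python embedding API; the model path is a
# placeholder. Passing embedding=True avoids the error above, but because the
# bundled llama.cpp does not read this model's pooling type from the GGUF,
# the resulting vectors will not match the reference implementation.
from llama_cpp import Llama

llm = Llama(
    model_path="model.Q4_0.gguf",  # placeholder: path to the downloaded GGUF
    embedding=True,                # required before calling create_embedding()
)

result = llm.create_embedding("Hello, world!")
vector = result["data"][0]["embedding"]
print(len(vector))  # embedding dimensionality
```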

Please consider using the official llama.cpp server from here: https://github.com/ggml-org/llama.cpp/releases
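
As a rough sketch, assuming a recent release binary and a placeholder model filename, the official server can be started with embeddings enabled and queried over its OpenAI-compatible endpoint:

```python
# Rough sketch of querying the official llama.cpp server; the binary name,
# model filename, and port are placeholders. Start the server with embeddings
# enabled first, e.g.:
#   ./llama-server -m model.Q4_0.gguf --embedding --port 8080
# Recent llama.cpp builds read the pooling type from the GGUF metadata, so the
# returned vectors should match the reference implementation.
import requests

resp = requests.post(
    "http://localhost:8080/v1/embeddings",  # OpenAI-compatible endpoint
    json={"input": "Hello, world!"},
)
resp.raise_for_status()
embedding = resp.json()["data"][0]["embedding"]
print(len(embedding))  # embedding dimensionality
```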

Cebtenzzre changed discussion status to closed