phate334/multilingual-e5-large-gguf

This model was converted to GGUF format from intfloat/multilingual-e5-large using llama.cpp.

Run it

  • Deploy using Docker

  $ docker run -p 8080:8080 \
      -v ./multilingual-e5-large-q4_k_m.gguf:/multilingual-e5-large-q4_k_m.gguf \
      ghcr.io/ggerganov/llama.cpp:server--b1-4b9afbb \
      --host 0.0.0.0 --embedding -m /multilingual-e5-large-q4_k_m.gguf

or with Docker Compose (f16 on port 8080, q4_k_m on port 8081):

services:
  e5-f16:
    image: ghcr.io/ggerganov/llama.cpp:server--b1-4b9afbb
    ports:
      - 8080:8080
    volumes:
      - ./multilingual-e5-large-f16.gguf:/multilingual-e5-large-f16.gguf
    command: --host 0.0.0.0 --embedding -m /multilingual-e5-large-f16.gguf
  e5-q4:
    image: ghcr.io/ggerganov/llama.cpp:server--b1-4b9afbb
    ports:
      - 8081:8080
    volumes:
      - ./multilingual-e5-large-q4_k_m.gguf:/multilingual-e5-large-q4_k_m.gguf
    command: --host 0.0.0.0 --embedding -m /multilingual-e5-large-q4_k_m.gguf
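
Once a container is running, the llama.cpp server exposes an /embedding endpoint that accepts a JSON body of the form {"content": "..."}. The sketch below is a minimal Python client, assuming the server from the examples above is reachable on localhost:8080 and that the response carries the vector under an "embedding" key. Note that E5 models are trained with instruction prefixes: prepend "query: " to search queries and "passage: " to documents, or retrieval quality degrades.

```python
import json
import urllib.request


def embed(text: str, host: str = "http://localhost:8080") -> list[float]:
    """Request an embedding from a llama.cpp server started with --embedding.

    E5 models expect a prefix: "query: " for queries, "passage: " for documents.
    """
    payload = json.dumps({"content": text}).encode()
    req = urllib.request.Request(
        f"{host}/embedding",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["embedding"]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity, the usual way to compare E5 embeddings."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = (sum(x * x for x in a) * sum(y * y for y in b)) ** 0.5
    return dot / norm
```

Typical usage: score = cosine(embed("query: how tall is Mount Fuji?"), embed("passage: Mount Fuji is 3,776 m high.")).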
GGUF details

Model size: 559M params
Architecture: bert
Available quantizations: 4-bit (q4_k_m), 16-bit (f16)