This is a GGUF conversion of Google's UMT5-XXL model, specifically the encoder part.

The weights can be used with `llama-embedding` from llama.cpp, or with the ComfyUI-GGUF custom node alongside image/video generation models.

This is a non-imatrix quant, as llama.cpp does not support imatrix creation for T5 models at the time of writing. Q5_K_M or larger is therefore recommended for best results, although smaller quants can still give decent results in resource-constrained scenarios.
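As a minimal sketch of standalone use, the encoder can be run through llama.cpp's `llama-embedding` binary. The quant filename below is an assumption; substitute whichever quantization you downloaded from this repository.

```shell
# Sketch: compute a text embedding with llama.cpp's llama-embedding binary.
# The filename umt5-xxl-encoder-Q5_K_M.gguf is an assumed example; use the
# quant you actually downloaded (e.g. a Q4 file in constrained setups).
./llama-embedding \
  -m umt5-xxl-encoder-Q5_K_M.gguf \
  -p "a photo of an astronaut riding a horse"
```

For ComfyUI, the file is instead placed in the models/text_encoders (or models/clip) folder and loaded via the ComfyUI-GGUF loader node.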

Model size: 5.68B params
Architecture: t5encoder
Available quantizations: 3-bit, 4-bit, 5-bit, 6-bit, 8-bit, 16-bit, 32-bit
Base model: google/umt5-xxl