Based on thirteenbit/madlad400-10b-mt-gguf, but with the model files split using llama-gguf-split.

This allows the models to be loaded in WASM, avoiding the browser's 2 GB ArrayBuffer size limit.
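A minimal sketch of why the split is needed: each shard must fit in a single ArrayBuffer, so a model larger than 2 GiB has to be cut into multiple GGUF files. The shard-count arithmetic and the `prefix-00001-of-0000N.gguf` naming below follow llama-gguf-split's conventions as I understand them; the 5 GiB example size is hypothetical, not this model's actual file size.

```python
import math

# Browsers cap a single ArrayBuffer at 2 GiB, so each shard must stay below this.
ARRAYBUFFER_LIMIT = 2 * 1024**3

def shard_count(total_bytes: int, max_shard_bytes: int = ARRAYBUFFER_LIMIT) -> int:
    """Number of GGUF shards needed so each one fits in a single ArrayBuffer."""
    return math.ceil(total_bytes / max_shard_bytes)

def shard_names(prefix: str, n: int) -> list[str]:
    """Shard naming in the style of llama-gguf-split: prefix-00001-of-0000N.gguf."""
    return [f"{prefix}-{i:05d}-of-{n:05d}.gguf" for i in range(1, n + 1)]

# Example: a hypothetical ~5 GiB quantized model needs 3 shards under the 2 GiB cap.
total = 5 * 1024**3
n = shard_count(total)
print(n)                            # 3
print(shard_names("model", n)[0])   # model-00001-of-00003.gguf
```

The split itself is done offline with llama.cpp's llama-gguf-split tool (e.g. with a `--split-max-size` below 2G); the exact invocation depends on your llama.cpp version.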

Format: GGUF
Model size: 10.7B params
Architecture: t5
Quantization: 3-bit