Based on thirteenbit/madlad400-10b-mt-gguf, but the models are split using `llama-gguf-split`. This way the models can be loaded in WASM, avoiding the browser's 2 GB ArrayBuffer size limit.
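As a rough sketch, splitting a GGUF file into sub-2 GB shards with `llama-gguf-split` (from llama.cpp) might look like the following. The file names and the exact size cap are assumptions for illustration, not taken from this repository:

```shell
# Split one large GGUF into shards just under the 2 GB ArrayBuffer limit.
# (File names and the 1.9G cap are hypothetical examples.)
llama-gguf-split --split --split-max-size 1.9G \
  madlad400-10b-mt.gguf \
  madlad400-10b-mt

# Shards follow llama.cpp's naming convention, e.g.
#   madlad400-10b-mt-00001-of-00005.gguf
# and can be merged back into a single file with:
llama-gguf-split --merge \
  madlad400-10b-mt-00001-of-00005.gguf \
  madlad400-10b-mt-merged.gguf
```

Loaders that understand the GGUF split convention (such as llama.cpp itself) only need to be pointed at the first shard; the remaining shards are discovered from the `-0000N-of-0000M` naming.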
Quantization: 3-bit