Can you incorporate the Q8 quantized GGUF model to make it easier to import the model into Ollama
#1 opened by HexEx
The process of merging files is too complicated
Error: pull model manifest: 400: The specified tag is a sharded GGUF. Ollama does not support this yet.
The Q8_0 quantized model is included and is not sharded, so both this request and the Ollama error message are confusing. If you mean a single file, that is not currently possible due to technical limitations (see the FAQ linked from the model page for details).
You can use our download page to get the model as a single file: https://hf.tst.eu/ - search for the model name, then hit Download beside the Q8_0 quant, and it will concatenate the parts while downloading.
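If you already have the parts locally, plain concatenation in part order also works for these splits. A minimal sketch, using dummy files so the commands are self-contained; the part filenames are hypothetical, so substitute the actual shard names from the repo:

```shell
# Hypothetical shard names - replace with the real ones from the repo.
# Dummy contents stand in for the actual GGUF data here.
printf 'first-half'  > model.Q8_0.gguf.part1of2
printf 'second-half' > model.Q8_0.gguf.part2of2

# Concatenate the parts, in order, into a single GGUF file.
cat model.Q8_0.gguf.part1of2 model.Q8_0.gguf.part2of2 > model.Q8_0.gguf

cat model.Q8_0.gguf   # -> first-halfsecond-half
```

Once you have the single file, it can be imported into Ollama with a Modelfile containing `FROM ./model.Q8_0.gguf` followed by `ollama create <name> -f Modelfile`.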
mradermacher changed discussion status to closed