Can you incorporate the Q8 quantized GGUF model to make it easier to import the model in Ollama?

#1
by HexEx - opened

The process of merging files is too complicated

Error: pull model manifest: 400: The specified tag is a sharded GGUF. Ollama does not support this yet.

The Q8_0 quantized model is included and is not sharded, so this request and the Ollama error message are confusing. If you mean a single file, that is not currently possible due to technical limitations (see the FAQ linked from the model page for details).

You can use our download page to concatenate the model: https://hf.tst.eu/ - search for the model name, then hit "download" next to the Q8_0 quant, and it will concatenate the parts while downloading.
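If you have already downloaded the parts yourself, the same concatenation can be done locally. A minimal sketch, using stand-in files with an assumed `partNofM` naming scheme (substitute the actual part names from the model page):

```shell
# Stand-in part files; replace with the real downloaded parts.
printf 'GGUFpart1' > model.Q8_0.gguf.part1of2
printf 'GGUFpart2' > model.Q8_0.gguf.part2of2

# The parts are byte-level splits, so concatenating them in order
# reassembles the single GGUF file.
cat model.Q8_0.gguf.part1of2 model.Q8_0.gguf.part2of2 > model.Q8_0.gguf
```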

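Once you have a single GGUF file, importing it into Ollama only needs a minimal Modelfile. A sketch, assuming the concatenated file is named `model.Q8_0.gguf` (the `ollama` commands are shown as comments since they require a local Ollama install):

```shell
# Write a minimal Ollama Modelfile pointing at the local single-file GGUF.
cat > Modelfile <<'EOF'
FROM ./model.Q8_0.gguf
EOF

# Then register and run it with Ollama:
#   ollama create my-model -f Modelfile
#   ollama run my-model
```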
mradermacher changed discussion status to closed
