dots.llm1.inst

(Last updated 2025-06-10, may be out of date)

GGUF Q4_0 quant of rednote-hilab/dots.llm1.inst. This architecture is not officially supported in mainline llama.cpp; you need this branch:

git clone https://github.com/Noeda/llama.cpp
cd llama.cpp
git checkout dots1
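After checking out the branch, llama.cpp still needs to be built (the launch command below expects the binary at llama.cpp/build/bin/llama-server, and -ngl implies a CUDA build). A typical sketch, assuming cmake and the CUDA toolkit are installed:

```shell
# from inside the llama.cpp checkout
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release -j
```

Drop -DGGML_CUDA=ON for a CPU-only build.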

The EOS token is not set correctly. You will need to override it with --override-kv tokenizer.ggml.eos_token_id=int:151649.

Example launch command (not ideal in all cases):

llama.cpp/build/bin/llama-server \
  --host 127.0.0.1 --port 8080 \
  -m dots.llm1.inst-Q4_0.gguf \
  -c 12800 -fa -ctk q8_0 -ctv q8_0 \
  -ngl 10 -ot "shexp=CUDA0" \
  --no-warmup \
  --override-kv tokenizer.ggml.eos_token_id=int:151649
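Once the server is up, llama-server exposes an OpenAI-compatible HTTP API. A minimal Python sketch against the /v1/chat/completions route (the host/port match the launch command above; the helper names here are mine, not part of llama.cpp):

```python
import json
import urllib.request

SERVER = "http://127.0.0.1:8080"  # matches --host/--port in the launch command

def build_chat_request(prompt: str, max_tokens: int = 256) -> dict:
    # Payload for llama-server's OpenAI-compatible /v1/chat/completions route.
    return {
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": 0.7,
    }

def chat(prompt: str) -> str:
    # Hypothetical helper: POST the request and extract the reply text.
    req = urllib.request.Request(
        f"{SERVER}/v1/chat/completions",
        data=json.dumps(build_chat_request(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With the EOS override in place, generations should stop on their own rather than running to the token limit.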
Format: GGUF (Q4_0)
Model size: 143B params
Architecture: dots1

Repo: ddh0/dots.llm1.inst-GGUF-Q4_0-EXPERIMENTAL (quantized from rednote-hilab/dots.llm1.inst)