dots.llm1.inst

(Last updated 2025-06-10, may be out of date)

GGUF Q4_0 quant of rednote-hilab/dots.llm1.inst. This architecture is not officially supported in mainline llama.cpp; you need this branch:

git clone https://github.com/Noeda/llama.cpp
cd llama.cpp
git checkout dots1
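After checking out the branch, llama.cpp still needs to be built (the launch command below expects the binary at llama.cpp/build/bin/llama-server, and -ngl implies a CUDA build). A typical sketch, assuming cmake and the CUDA toolkit are installed:

```shell
# from inside the llama.cpp checkout
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release -j
```

Drop -DGGML_CUDA=ON for a CPU-only build.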

The EOS token is not set correctly. You will need to override it with --override-kv tokenizer.ggml.eos_token_id=int:151649.

Example launch command (not ideal in all cases):

llama.cpp/build/bin/llama-server \
  --host 127.0.0.1 --port 8080 \
  -m dots.llm1.inst-Q4_0.gguf \
  -c 12800 -fa -ctk q8_0 -ctv q8_0 \
  -ngl 10 -ot "shexp=CUDA0" \
  --no-warmup \
  --override-kv tokenizer.ggml.eos_token_id=int:151649
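Once the server is up, llama-server exposes an OpenAI-compatible HTTP API. A minimal Python sketch against the /v1/chat/completions route (the host/port match the launch command above; the helper names here are mine, not part of llama.cpp):

```python
import json
import urllib.request

SERVER = "http://127.0.0.1:8080"  # matches --host/--port in the launch command

def build_chat_request(prompt: str, max_tokens: int = 256) -> dict:
    # Payload for llama-server's OpenAI-compatible /v1/chat/completions route.
    return {
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": 0.7,
    }

def chat(prompt: str) -> str:
    # Hypothetical helper: POST the request and extract the reply text.
    req = urllib.request.Request(
        f"{SERVER}/v1/chat/completions",
        data=json.dumps(build_chat_request(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With the EOS override in place, generations should stop on their own rather than running to the token limit.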
Format: GGUF (Q4_0)
Model size: 143B params
Architecture: dots1

Repo: ddh0/dots.llm1.inst-GGUF-Q4_0-EXPERIMENTAL (quantized from rednote-hilab/dots.llm1.inst)