dots.llm1.inst
(Last updated 2025-06-10, may be out of date)
GGUF Q4_0 quant of rednote-hilab/dots.llm1.inst. Not officially supported in llama.cpp; it requires this branch:

```shell
git clone https://github.com/Noeda/llama.cpp
cd llama.cpp
git checkout dots1
```

(Note: `git checkout dots1` checks out the existing `dots1` branch from the fork; `git checkout -b dots1` would only create a new, empty branch off the current HEAD.)
The EOS token is not set correctly, so you will need to override it with `--override-kv tokenizer.ggml.eos_token_id=int:151649`.
Example launch command (not ideal in all cases; adjust for your hardware):

```shell
llama.cpp/build/bin/llama-server --host 127.0.0.1 --port 8080 -m dots.llm1.inst-Q4_0.gguf -c 12800 -fa -ctk q8_0 -ctv q8_0 -ngl 10 -ot "shexp=CUDA0" --no-warmup --override-kv tokenizer.ggml.eos_token_id=int:151649
```
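Once the server is up, it exposes llama-server's OpenAI-compatible HTTP API. A minimal sketch of a client request, assuming the host/port from the launch command above (the `"model"` name is arbitrary for a single-model server, and `max_tokens` is just an illustrative value):

```python
import json
import urllib.request

# Endpoint assumed from the launch command above (--host 127.0.0.1 --port 8080).
URL = "http://127.0.0.1:8080/v1/chat/completions"

# Illustrative payload; llama-server ignores the model name when only
# one model is loaded.
PAYLOAD = {
    "model": "dots.llm1.inst",
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 128,
}

def build_request(url: str = URL) -> urllib.request.Request:
    """Build the POST request without sending it."""
    return urllib.request.Request(
        url,
        data=json.dumps(PAYLOAD).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

def send(req: urllib.request.Request) -> str:
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Calling `send(build_request())` against the running server returns the model's reply; if generation never stops, double-check that the EOS override flag is present.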
Model tree for ddh0/dots.llm1.inst-GGUF-Q4_0-EXPERIMENTAL
- Base model: rednote-hilab/dots.llm1.base
- Finetuned: rednote-hilab/dots.llm1.inst