Starcannon-Unleashed-12B-v1.0-GGUF
Static Quantization of VongolaChouko/Starcannon-Unleashed-12B-v1.0.
This model was converted to GGUF format from VongolaChouko/Starcannon-Unleashed-12B-v1.0 using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details on the model.
I recommend using these files with koboldcpp. You can find its latest release here: koboldcpp-1.76
Recommended settings are here: Settings
Download a file (not the whole branch) from below:
Filename | Quant type | File Size | Split | Description |
---|---|---|---|---|
Starcannon-Unleashed-12B-v1.0-FP16.gguf | F16 | 24.50GB | false | Full F16 weights. |
Starcannon-Unleashed-12B-v1.0-Q8_0.gguf | Q8_0 | 13.02GB | false | Extremely high quality, generally unneeded but max available quant. |
Starcannon-Unleashed-12B-v1.0-Q6_K.gguf | Q6_K | 10.06GB | false | Very high quality, near perfect, recommended. |
Starcannon-Unleashed-12B-v1.0-Q5_K_M.gguf | Q5_K_M | 8.73GB | false | High quality, recommended. |
Starcannon-Unleashed-12B-v1.0-Q5_K_S.gguf | Q5_K_S | 8.52GB | false | High quality, recommended. |
Starcannon-Unleashed-12B-v1.0-Q4_K_M.gguf | Q4_K_M | 7.48GB | false | Good quality, default size for most use cases, recommended. |
Starcannon-Unleashed-12B-v1.0-Q4_K_S.gguf | Q4_K_S | 7.12GB | false | Slightly lower quality with more space savings, recommended. |
Starcannon-Unleashed-12B-v1.0-Q4_0.gguf | Q4_0 | 7.09GB | false | Legacy format, generally not worth using over similarly sized formats. |
Starcannon-Unleashed-12B-v1.0-Q3_K_L.gguf | Q3_K_L | 6.56GB | false | Lower quality but usable, good for low RAM availability. |
Starcannon-Unleashed-12B-v1.0-Q3_K_M.gguf | Q3_K_M | 6.08GB | false | Low quality. |
Starcannon-Unleashed-12B-v1.0-Q3_K_S.gguf | Q3_K_S | 5.53GB | false | Low quality, not recommended. |
Starcannon-Unleashed-12B-v1.0-Q2_K.gguf | Q2_K | 4.79GB | false | Very low quality but surprisingly usable. |
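
As a rough aid for choosing a row, the sketch below picks the largest quant whose file size fits a given budget in GB. The sizes are copied from the table above; the helper name `pick_quant` and the 9 GB budget are made up for illustration, and the budget covers the file only, so leave extra room for context. The commented download line assumes the `huggingface-cli` tool from `huggingface_hub` is installed.

```shell
# pick_quant BUDGET_GB: print the largest quant type whose file fits the budget.
# Returns a non-zero status if even the smallest file is too large.
pick_quant() {
  awk -v budget="$1" '
    # Rows are sorted largest first, so the first fit is the best fit.
    $2 + 0 <= budget + 0 { print $1; found = 1; exit }
    END { exit found ? 0 : 1 }
  ' <<'EOF'
FP16 24.50
Q8_0 13.02
Q6_K 10.06
Q5_K_M 8.73
Q5_K_S 8.52
Q4_K_M 7.48
Q4_K_S 7.12
Q4_0 7.09
Q3_K_L 6.56
Q3_K_M 6.08
Q3_K_S 5.53
Q2_K 4.79
EOF
}

# Example: a hypothetical 9 GB budget.
quant="$(pick_quant 9)"
file="Starcannon-Unleashed-12B-v1.0-${quant}.gguf"
echo "$file"

# Assumed download step (not run here), fetching only that one file:
# huggingface-cli download VongolaChouko/Starcannon-Unleashed-12B-v1.0-GGUF "$file" --local-dir .
```

The function only compares file sizes; it does not account for the quality notes in the Description column, so treat its answer as a starting point.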
Instruct
Both ChatML and Mistral should work fine. Personally, I tested this using ChatML, and I found that I like the model's responses better with this format. Try both and see which one you like best. :D
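For reference, here is a minimal sketch of what a ChatML-formatted prompt looks like when built by hand. It assumes the standard ChatML layout (`<|im_start|>` role, message, `<|im_end|>`); the helper name `chatml_prompt` and both messages are hypothetical, and front ends such as SillyTavern or koboldcpp normally apply this template for you.

```shell
# chatml_prompt SYSTEM USER: print a ChatML prompt with one system and one user turn.
# The final "<|im_start|>assistant" line leaves the turn open for the model to complete.
chatml_prompt() {
  printf '<|im_start|>system\n%s<|im_end|>\n<|im_start|>user\n%s<|im_end|>\n<|im_start|>assistant\n' "$1" "$2"
}

# Example with placeholder messages.
chatml_prompt "You are a helpful narrator." "Describe the harbor at dawn."
```

If stray `<|im_end|>` tokens show up in replies, comparing the prompt actually sent against this layout is a quick first check.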
Settings
I recommend using these settings: Starcannon-Unleashed-12B-v1.0-ST-Formatting-2024-10-29.json
IMPORTANT: Open SillyTavern and use "Master Import", which can be found under the "A" tab (Advanced Formatting). Replace the "INSERT WORLD HERE" placeholders with the world/universe your character belongs to. If not applicable, just remove that part.
Check your User Settings and set "Example Messages Behavior" to "Never include examples" to prevent the Examples of Dialogue from being sent twice in the context. People have reported that, if this is not set, <|im_end|> tokens appear in the output. Refer to this post for more info.
Temperature 1.15 - 1.25 is good, but lower should also work well, as long as you also tweak the Min P and XTC to ensure the model won't choke. Play around with it to see what suits your taste.
This is a modified version of MarinaraSpaghetti's Mistral-Small-Correct.json, transformed into ChatML.
You can find the original version here: MarinaraSpaghetti/SillyTavern-Settings
To use with llama.cpp
Install llama.cpp through brew (works on Mac and Linux)
brew install llama.cpp
Invoke the llama.cpp server or the CLI.
CLI:
llama-cli --hf-repo VongolaChouko/Starcannon-Unleashed-12B-v1.0-Q6_K-GGUF --hf-file starcannon-unleashed-12b-v1.0-q6_k.gguf -p "The meaning to life and the universe is"
Server:
llama-server --hf-repo VongolaChouko/Starcannon-Unleashed-12B-v1.0-Q6_K-GGUF --hf-file starcannon-unleashed-12b-v1.0-q6_k.gguf -c 2048
Note: You can also use this checkpoint directly through the usage steps listed in the llama.cpp repo.
Step 1: Clone llama.cpp from GitHub.
git clone https://github.com/ggerganov/llama.cpp
Step 2: Move into the llama.cpp folder and build it with the LLAMA_CURL=1 flag, along with other hardware-specific flags (e.g., LLAMA_CUDA=1 for Nvidia GPUs on Linux).
cd llama.cpp && LLAMA_CURL=1 make
Step 3: Run inference through the main binary.
./llama-cli --hf-repo VongolaChouko/Starcannon-Unleashed-12B-v1.0-Q6_K-GGUF --hf-file starcannon-unleashed-12b-v1.0-q6_k.gguf -p "The meaning to life and the universe is"
or
./llama-server --hf-repo VongolaChouko/Starcannon-Unleashed-12B-v1.0-Q6_K-GGUF --hf-file starcannon-unleashed-12b-v1.0-q6_k.gguf -c 2048
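Once llama-server is running, you can send it a prompt over HTTP. The sketch below builds a JSON body for the server's `/completion` endpoint and, separately, shows a curl call that posts it. It assumes the server listens on the default `localhost:8080`; the helper names `completion_body` and `ask_server` and the `n_predict` value are illustrative, and the temperature of 1.2 is simply one value inside the 1.15 - 1.25 range suggested in the Settings section.

```shell
# completion_body PROMPT: print a JSON request body for llama-server's /completion endpoint.
completion_body() {
  # Escape backslashes and double quotes; assumes a single-line prompt.
  escaped="$(printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g')"
  # n_predict (max new tokens) is an arbitrary illustrative value.
  printf '{"prompt": "%s", "n_predict": 128, "temperature": 1.2}\n' "$escaped"
}

# ask_server PROMPT: post the body to a locally running llama-server.
# Assumes the default address http://localhost:8080.
ask_server() {
  completion_body "$1" | curl -s http://localhost:8080/completion \
    -H 'Content-Type: application/json' -d @-
}

# Print the request body without contacting any server.
completion_body 'The meaning to life and the universe is'

# With the server from the step above running, send it instead:
# ask_server 'The meaning to life and the universe is'
```

The body is built separately from the request so you can inspect or tweak the sampler values before anything is sent.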