Starcannon-Unleashed-12B-v1.0-GGUF

Static Quantization of VongolaChouko/Starcannon-Unleashed-12B-v1.0.

This model was converted to GGUF format from VongolaChouko/Starcannon-Unleashed-12B-v1.0 using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details on the model.

I recommend running these quants with koboldcpp. You can find the latest release here: koboldcpp-1.76

Recommended settings are here: Settings

Download a file (not the whole branch) from below:

Filename	Quant type	File Size	Split	Description
Starcannon-Unleashed-12B-v1.0-FP16.gguf	F16	24.50GB	false	Full F16 weights.
Starcannon-Unleashed-12B-v1.0-Q8_0.gguf	Q8_0	13.02GB	false	Extremely high quality, generally unneeded but max available quant.
Starcannon-Unleashed-12B-v1.0-Q6_K.gguf	Q6_K	10.06GB	false	Very high quality, near perfect, recommended.
Starcannon-Unleashed-12B-v1.0-Q5_K_M.gguf	Q5_K_M	8.73GB	false	High quality, recommended.
Starcannon-Unleashed-12B-v1.0-Q5_K_S.gguf	Q5_K_S	8.52GB	false	High quality, recommended.
Starcannon-Unleashed-12B-v1.0-Q4_K_M.gguf	Q4_K_M	7.48GB	false	Good quality, default size for most use cases, recommended.
Starcannon-Unleashed-12B-v1.0-Q4_K_S.gguf	Q4_K_S	7.12GB	false	Slightly lower quality with more space savings, recommended.
Starcannon-Unleashed-12B-v1.0-Q4_0.gguf	Q4_0	7.09GB	false	Legacy format, generally not worth using over similarly sized formats.
Starcannon-Unleashed-12B-v1.0-Q3_K_L.gguf	Q3_K_L	6.56GB	false	Lower quality but usable, good for low RAM availability.
Starcannon-Unleashed-12B-v1.0-Q3_K_M.gguf	Q3_K_M	6.08GB	false	Low quality.
Starcannon-Unleashed-12B-v1.0-Q3_K_S.gguf	Q3_K_S	5.53GB	false	Low quality, not recommended.
Starcannon-Unleashed-12B-v1.0-Q2_K.gguf	Q2_K	4.79GB	false	Very low quality but surprisingly usable.
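
To fetch a single file rather than the whole repository, one option is the `huggingface-cli download` command from the `huggingface_hub` package. The sketch below is illustrative only: the repo id is assumed from this card's title, `./models` is an arbitrary target directory, and `pick_quant`, `quant_file`, and `download_quant` are hypothetical helper names. `pick_quant` returns the largest file from the table above that fits a size budget in GB; it ignores the extra memory needed for context, so leave some headroom.

```shell
#!/bin/sh
# Assumption: repo id inferred from this card's title; adjust if yours differs.
REPO="VongolaChouko/Starcannon-Unleashed-12B-v1.0-GGUF"

# Print the largest quant whose file size (GB, copied from the table) fits
# the budget given as $1. Prints nothing if even Q2_K is too large.
# Q4_0 is left out because the table describes it as a legacy format.
pick_quant() {
  awk -v budget="$1" '$2 + 0 <= budget + 0 { print $1; exit }' <<'EOF'
FP16 24.50
Q8_0 13.02
Q6_K 10.06
Q5_K_M 8.73
Q5_K_S 8.52
Q4_K_M 7.48
Q4_K_S 7.12
Q3_K_L 6.56
Q3_K_M 6.08
Q3_K_S 5.53
Q2_K 4.79
EOF
}

# Build the file name for a tag, following the table's naming scheme.
# Note that the F16 weights use the tag "FP16" in their file name.
quant_file() {
  printf 'Starcannon-Unleashed-12B-v1.0-%s.gguf\n' "$1"
}

# Download exactly one file into ./models.
# Requires: pip install -U "huggingface_hub[cli]"
download_quant() {
  huggingface-cli download "$REPO" "$(quant_file "$1")" --local-dir ./models
}
```

For example, `download_quant "$(pick_quant 12)"` would request the Q6_K file, since 10.06GB is the largest entry that fits a 12GB budget.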

Instruct

Both ChatML and Mistral should work fine. Personally, I tested this using ChatML, and I like the model's responses better with that format. Try both and see which one you like best. :D
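
For reference, a ChatML prompt wraps each turn in `<|im_start|>` and `<|im_end|>` markers and ends with an open assistant turn. The helper below is a minimal sketch, not the card author's template: `chatml_prompt` is a hypothetical name, and frontends such as SillyTavern or koboldcpp normally build this string for you.

```shell
#!/bin/sh
# Print a single-turn ChatML prompt.
# $1 = system prompt, $2 = user message.
# The final line leaves the assistant turn open for the model to complete.
chatml_prompt() {
  printf '<|im_start|>system\n%s<|im_end|>\n<|im_start|>user\n%s<|im_end|>\n<|im_start|>assistant\n' "$1" "$2"
}
```

Calling `chatml_prompt "You are a helpful narrator." "Describe the tavern."` prints the three-part prompt, which can be passed to any backend that accepts raw text.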

Settings

I recommend using these settings: Starcannon-Unleashed-12B-v1.0-ST-Formatting-2024-10-29.json

IMPORTANT: Open SillyTavern and use "Master Import", which can be found under the "A" tab (Advanced Formatting). Replace the "INSERT WORLD HERE" placeholders with the world/universe your character belongs to. If not applicable, just remove that part.

Check your User Settings and set "Example Messages Behavior" to "Never include examples" to prevent the Examples of Dialogue from being sent twice in the context. People have reported that, when this is not set, the model outputs <|im_end|> tokens. Refer to this post for more info.

A temperature of 1.15 to 1.25 works well, but lower values should also work, as long as you also tweak Min P and XTC to ensure the model won't choke. Play around with it to see what suits your taste.
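
If you call koboldcpp over HTTP instead of through SillyTavern, the same samplers can be set in the request body. The sketch below is an assumption-heavy illustration: `sampler_payload` is a hypothetical helper, the field names (`min_p`, `xtc_threshold`, `xtc_probability`) should be checked against your koboldcpp version's API documentation, and the Min P and XTC values in the usage comment are placeholders, not recommendations from this card.

```shell
#!/bin/sh
# Print a JSON body for a text-generation request.
# $1 = prompt (this sketch does no JSON escaping, so keep it free of quotes,
#      backslashes, and newlines)
# $2 = temperature, $3 = min_p, $4 = xtc_threshold, $5 = xtc_probability
sampler_payload() {
  printf '{"prompt": "%s", "max_length": 200, "temperature": %s, "min_p": %s, "xtc_threshold": %s, "xtc_probability": %s}\n' \
    "$1" "$2" "$3" "$4" "$5"
}

# Example call (placeholder sampler values; port 5001 is koboldcpp's usual default):
#   curl -s http://localhost:5001/api/v1/generate \
#     -H 'Content-Type: application/json' \
#     -d "$(sampler_payload 'Hello' 1.2 0.05 0.1 0.5)"
```

Keeping the sampler values as arguments makes it easy to try a few temperatures within the suggested range and compare outputs.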

This is a modified version of MarinaraSpaghetti's Mistral-Small-Correct.json, transformed into ChatML.

You can find the original version here: MarinaraSpaghetti/SillyTavern-Settings

To use with llama.cpp

Install llama.cpp through brew (works on Mac and Linux)

brew install llama.cpp

Invoke the llama.cpp server or the CLI.

CLI:

llama-cli --hf-repo VongolaChouko/Starcannon-Unleashed-12B-v1.0-Q6_K-GGUF --hf-file starcannon-unleashed-12b-v1.0-q6_k.gguf -p "The meaning to life and the universe is"

Server:

llama-server --hf-repo VongolaChouko/Starcannon-Unleashed-12B-v1.0-Q6_K-GGUF --hf-file starcannon-unleashed-12b-v1.0-q6_k.gguf -c 2048
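
Once the server is running, it can be queried over its OpenAI-compatible chat endpoint. The following is a minimal sketch, assuming the server listens on its default address of http://localhost:8080; `chat_body`, `ask_server`, and the `LLAMA_URL` variable are hypothetical names introduced here, and the default temperature of 1.2 is only chosen to fall within the range suggested in the Settings section.

```shell
#!/bin/sh
# Print a JSON chat request body.
# $1 = user message (no JSON escaping is done in this sketch)
# $2 = temperature, defaults to 1.2 when omitted
chat_body() {
  printf '{"messages": [{"role": "user", "content": "%s"}], "temperature": %s}\n' \
    "$1" "${2:-1.2}"
}

# Send the request to a running llama-server instance.
# Override LLAMA_URL if the server is bound to a different host or port.
ask_server() {
  curl -s "${LLAMA_URL:-http://localhost:8080}/v1/chat/completions" \
    -H 'Content-Type: application/json' \
    -d "$(chat_body "$1" "$2")"
}
```

For example, `ask_server "Describe the tavern in two sentences."` returns a JSON response whose generated text sits under `choices[0].message.content`.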

Note: You can also use this checkpoint directly by following the usage steps listed in the llama.cpp repo.

Step 1: Clone llama.cpp from GitHub.

git clone https://github.com/ggerganov/llama.cpp

Step 2: Move into the llama.cpp folder and build it with the LLAMA_CURL=1 flag, along with any other hardware-specific flags (for example, LLAMA_CUDA=1 for Nvidia GPUs on Linux).

cd llama.cpp && LLAMA_CURL=1 make

Step 3: Run inference through the CLI or the server binary.

./llama-cli --hf-repo VongolaChouko/Starcannon-Unleashed-12B-v1.0-Q6_K-GGUF --hf-file starcannon-unleashed-12b-v1.0-q6_k.gguf -p "The meaning to life and the universe is"

./llama-server --hf-repo VongolaChouko/Starcannon-Unleashed-12B-v1.0-Q6_K-GGUF --hf-file starcannon-unleashed-12b-v1.0-q6_k.gguf -c 2048
