Starcannon-Unleashed-12B-v1.0-GGUF

Static Quantization of VongolaChouko/Starcannon-Unleashed-12B-v1.0.

This model was converted to GGUF format from VongolaChouko/Starcannon-Unleashed-12B-v1.0 using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details on the model.

I recommend using these quants with koboldcpp. You can find the latest release here: koboldcpp-1.76

Recommended settings are here: Settings


Download a file (not the whole branch) from below:

| Filename | Quant type | File Size | Split | Description |
| --- | --- | --- | --- | --- |
| Starcannon-Unleashed-12B-v1.0-FP16.gguf | F16 | 24.50GB | false | Full F16 weights. |
| Starcannon-Unleashed-12B-v1.0-Q8_0.gguf | Q8_0 | 13.02GB | false | Extremely high quality, generally unneeded but max available quant. |
| Starcannon-Unleashed-12B-v1.0-Q6_K.gguf | Q6_K | 10.06GB | false | Very high quality, near perfect, recommended. |
| Starcannon-Unleashed-12B-v1.0-Q5_K_M.gguf | Q5_K_M | 8.73GB | false | High quality, recommended. |
| Starcannon-Unleashed-12B-v1.0-Q5_K_S.gguf | Q5_K_S | 8.52GB | false | High quality, recommended. |
| Starcannon-Unleashed-12B-v1.0-Q4_K_M.gguf | Q4_K_M | 7.48GB | false | Good quality, default size for most use cases, recommended. |
| Starcannon-Unleashed-12B-v1.0-Q4_K_S.gguf | Q4_K_S | 7.12GB | false | Slightly lower quality with more space savings, recommended. |
| Starcannon-Unleashed-12B-v1.0-Q4_0.gguf | Q4_0 | 7.09GB | false | Legacy format, generally not worth using over similarly sized formats. |
| Starcannon-Unleashed-12B-v1.0-Q3_K_L.gguf | Q3_K_L | 6.56GB | false | Lower quality but usable, good for low RAM availability. |
| Starcannon-Unleashed-12B-v1.0-Q3_K_M.gguf | Q3_K_M | 6.08GB | false | Low quality. |
| Starcannon-Unleashed-12B-v1.0-Q3_K_S.gguf | Q3_K_S | 5.53GB | false | Low quality, not recommended. |
| Starcannon-Unleashed-12B-v1.0-Q2_K.gguf | Q2_K | 4.79GB | false | Very low quality but surprisingly usable. |
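
As a rough aid for choosing a row, the sketch below picks the largest file that fits a size budget and builds a direct link to that single file. The helper names `pick_quant` and `file_url` are hypothetical, the sizes are copied from the table, and the URL assumes the files sit on the repository's `main` branch. File size is only a lower bound on memory use, since the context cache needs room too, so leave some headroom in the budget.

```shell
# Filename suffix and size in GB, copied from the table (largest first).
quants='FP16 24.50
Q8_0 13.02
Q6_K 10.06
Q5_K_M 8.73
Q5_K_S 8.52
Q4_K_M 7.48
Q4_K_S 7.12
Q4_0 7.09
Q3_K_L 6.56
Q3_K_M 6.08
Q3_K_S 5.53
Q2_K 4.79'

# pick_quant BUDGET_GB: print the largest quant whose file fits the budget.
# Prints nothing if no file fits.
pick_quant() {
  printf '%s\n' "$quants" |
    awk -v budget="$1" '$2 + 0 <= budget + 0 { print $1; exit }'
}

# file_url SUFFIX: print the direct URL of that single file.
# Assumption: the files live on the "main" branch of this repo.
file_url() {
  printf 'https://huggingface.co/%s/resolve/main/Starcannon-Unleashed-12B-v1.0-%s.gguf\n' \
    "VongolaChouko/Starcannon-Unleashed-12B-v1.0-GGUF" "$1"
}

pick_quant 8   # prints Q4_K_M
```

To fetch the chosen file, something like `curl -L -O "$(file_url "$(pick_quant 8)")"` should work. Note that a budget just above 7.09 selects Q4_0, which the table describes as a legacy format.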

Instruct

Both ChatML and Mistral should work fine. Personally, I tested this using ChatML, and I like the model's responses better with this format. Try both and see which one you like best. :D
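
For reference, a ChatML prompt wraps each turn in `<|im_start|>` and `<|im_end|>` markers and ends with an open assistant turn. This is a minimal sketch, assuming you are building the prompt by hand (frontends such as SillyTavern do this for you). `chatml_prompt` is a hypothetical helper name, and the example messages are made up.

```shell
# chatml_prompt SYSTEM USER: print a ChatML prompt that ends with an open
# assistant turn, so the model continues from there.
chatml_prompt() {
  printf '<|im_start|>system\n%s<|im_end|>\n<|im_start|>user\n%s<|im_end|>\n<|im_start|>assistant\n' \
    "$1" "$2"
}

chatml_prompt 'You are a helpful assistant.' 'Hello!'
```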

Settings

I recommend using these settings: Starcannon-Unleashed-12B-v1.0-ST-Formatting-2024-10-29.json

IMPORTANT: Open SillyTavern and use "Master Import", which can be found under the "A" tab (Advanced Formatting). Replace the "INSERT WORLD HERE" placeholders with the world/universe your character belongs to. If not applicable, just remove that part.

Check your User Settings and set "Example Messages Behavior" to "Never include examples" to prevent the Examples of Dialogue from being sent twice in the context. People have reported that, if this is not set, the model outputs <|im_end|> tokens. Refer to this post for more info.

A temperature of 1.15 to 1.25 works well, but lower values should also work, as long as you also tweak Min P and XTC to ensure the model won't choke. Play around with it to see what suits your taste.
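
To get a feel for how temperature and Min P interact, the sketch below counts how many candidate tokens survive a Min P filter at two temperatures. Min P keeps a token only if its probability is at least Min P times that of the most likely token. The sketch assumes temperature is applied before Min P. Sampler order varies by backend and can often be changed, so treat this as an illustration rather than a description of any specific backend. The logits are made up and `min_p_keep` is a hypothetical helper.

```shell
# min_p_keep TEMP MIN_P LOGIT...: print how many tokens pass the Min P filter
# after dividing the logits by TEMP (assumed order: temperature, then Min P).
min_p_keep() {
  temp=$1; min_p=$2; shift 2
  printf '%s\n' "$@" | awk -v t="$temp" -v p="$min_p" '
    { l[NR] = $1 / t; if (NR == 1 || l[NR] > max) max = l[NR] }
    END {
      # prob_i / prob_max equals exp(l_i - max), so no full softmax is needed
      for (i = 1; i <= NR; i++) if (exp(l[i] - max) >= p) kept++
      print kept + 0
    }'
}

min_p_keep 1.0  0.1 5 4 2.5 0   # prints 2
min_p_keep 1.25 0.1 5 4 2.5 0   # prints 3
```

Under this assumption, raising the temperature flattens the distribution, so more tokens clear the same Min P threshold.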

This is a modified version of MarinaraSpaghetti's Mistral-Small-Correct.json, transformed into ChatML.

You can find the original version here: MarinaraSpaghetti/SillyTavern-Settings

To use with llama.cpp

Install llama.cpp through brew (works on Mac and Linux)

brew install llama.cpp

Invoke the llama.cpp server or the CLI.

CLI:

llama-cli --hf-repo VongolaChouko/Starcannon-Unleashed-12B-v1.0-Q6_K-GGUF --hf-file starcannon-unleashed-12b-v1.0-q6_k.gguf -p "The meaning to life and the universe is"

Server:

llama-server --hf-repo VongolaChouko/Starcannon-Unleashed-12B-v1.0-Q6_K-GGUF --hf-file starcannon-unleashed-12b-v1.0-q6_k.gguf -c 2048
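
Once the server is up, you can query it over HTTP. This is a minimal sketch, assuming llama-server is listening on its default address (http://localhost:8080) and that its OpenAI-compatible `/v1/chat/completions` endpoint accepts `min_p` alongside `temperature`. The helper names, the `LLAMA_URL` variable, and the Min P value of 0.05 are placeholders, not recommendations from this card.

```shell
# build_payload TEXT TEMP MIN_P: print a JSON request body.
# Naive on purpose: TEXT must not contain quotes or backslashes.
build_payload() {
  printf '{"messages":[{"role":"user","content":"%s"}],"temperature":%s,"min_p":%s,"max_tokens":128}\n' \
    "$1" "$2" "$3"
}

# ask TEXT: send one user message to a running llama-server.
ask() {
  curl -s "${LLAMA_URL:-http://localhost:8080}/v1/chat/completions" \
    -H 'Content-Type: application/json' \
    -d "$(build_payload "$1" 1.15 0.05)"
}

build_payload 'Hello!' 1.15 0.05   # prints the request body without sending it
# ask 'Hello!'                     # uncomment once the server is running
```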

Note: You can also use this checkpoint directly through the usage steps listed in the llama.cpp repo.

Step 1: Clone llama.cpp from GitHub.

git clone https://github.com/ggerganov/llama.cpp

Step 2: Move into the llama.cpp folder and build it with the LLAMA_CURL=1 flag, along with any other hardware-specific flags (for example, LLAMA_CUDA=1 for Nvidia GPUs on Linux).

cd llama.cpp && LLAMA_CURL=1 make

Step 3: Run inference through the main binary.

./llama-cli --hf-repo VongolaChouko/Starcannon-Unleashed-12B-v1.0-Q6_K-GGUF --hf-file starcannon-unleashed-12b-v1.0-q6_k.gguf -p "The meaning to life and the universe is"

or

./llama-server --hf-repo VongolaChouko/Starcannon-Unleashed-12B-v1.0-Q6_K-GGUF --hf-file starcannon-unleashed-12b-v1.0-q6_k.gguf -c 2048