πŸ™ Cthulhu 24B 1.2

Mistral Small 3.2 24B
Prepare to delve into the depths of language model fusion with Cthulhu, a monumental model merge based on Mistral Small v3.2 (2506) and Mistral Small v3.1 (2503). This ambitious project aims to synthesize the collective intelligence of the latest cutting-edge finetunes of Mistral Small, creating a "supermerge" that transcends the capabilities of any single iteration.


Overview

This is a creative, uncensored merge of pre-trained language models created using [mergekit].
  • Octopus/Squid-like Features: Cthulhu is famously described as having an "octopus-like head whose face was a mass of feelers" or "tentacles." While his body is vaguely anthropoid and dragon-like, the cephalopod elements are prominent.
  • Multiple Aspects/Hybridity: Lovecraft describes Cthulhu as a blend of octopus, dragon, and human caricature. This inherent hybridity aligns perfectly with a merged AI model that combines diverse functionalities and "personalities" from all of its constituent parts. Each of the merged models contributes a distinct "aspect" to the whole, much like Cthulhu's various monstrous forms.
  • Cosmic and Ancient Knowledge: Lovecraftian entities are often associated with vast, ancient, and often disturbing knowledge that transcends human comprehension. This resonates with the idea of an advanced AI system that holds immense amounts of information and capabilities.
  • Underlying Presence: Cthulhu is said to be hibernating, but his presence subtly influences humanity. This merged model features a constant, underlying presence that combines the strengths of its parts.
  • Unfathomable Power: Lovecraft's beings are incomprehensibly powerful. This merge aims for a similar sense of enhanced capability. For sheer iconic recognition and fitting symbolism of a powerful, multi-faceted, somewhat aquatic horror, these "merged models" are like the foundational "aspects" or "pillars" of this new, emergent Cthulhu-like intelligence.

Format

</s>[INST] [/INST]
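The template above follows Mistral's instruct format. As a minimal sketch of how a prompt might be assembled by hand (in practice, prefer the tokenizer's `apply_chat_template`, which is authoritative; the helper names below are illustrative only):

```python
# Hand-rolled sketch of the Mistral-style instruct format shown above.
# Assumes the standard [INST] ... [/INST] wrapping with </s> closing
# each completed assistant turn; function names are hypothetical.
def format_prompt(user_message: str, system_prompt: str = "") -> str:
    body = f"{system_prompt}\n\n{user_message}" if system_prompt else user_message
    return f"[INST] {body} [/INST]"

def append_turn(history: str, assistant_reply: str, next_user_message: str) -> str:
    # A finished assistant turn is terminated with </s> before the next [INST] block.
    return f"{history}{assistant_reply}</s>[INST] {next_user_message} [/INST]"

prompt = format_prompt(
    "Who are you?",
    "You are Cthulhu, an ancient creature with profound occult wisdom.",
)
```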

Prompt

Use this prompt if you want it to respond in the style of Cthulhu:
You are Cthulhu, an ancient creature with profound occult wisdom. The nature of your responses should emulate the style of Cthulhu.

Updates

The original Cthulhu v1.0 merged 8 models, whereas v1.1 more than doubles that with 17. Although highly experimental, v1.1 should in theory yield further gains in prose, nuance and creative writing, while remaining fully uncensored and adept at instruction following. Note: a simple jailbreak may be needed to start the prompt. If it still refuses, try adjusting the temperature.

Cthulhu v1.2 is another experimental release; it may perform better or worse than v1.1. The idea is to compare the two, to see whether removing the superseded models (Codex, Pantheon, Harbinger) yields higher overall quality and avoids re-introducing slop. The other difference is that Animus v7.1 replaces v5.1. In my initial tests, Cthulhu v1.2 seems to produce more detailed responses than v1.1.

Datasets

Cthulhu is planned to be augmented in a future version (either pre- or post-merge) with [lovecraftcorpus] and [alpaca_cthulhu_full], possibly via [QLoRA].

Quantization

Quantized GGUFs are available for download here, ranging from IQ1_S to Q8_0.

This model was converted to GGUF format from [Fentible/Cthulhu-24B-v1.2] using llama.cpp via Fentible's [GGUF-repo-suite].

GGUF Repo Suite is based on a refactored fork of ggml.ai's [GGUF-my-repo] space, updated for offline use on Windows and with support for lower IQ quants.

The imatrix.dat file was generated using bartowski's [calibration_datav3.txt].

Refer to the [original model card] for more details on the model.

Provided Quants

(sorted by size, not necessarily quality. IQ-quants are often preferable over similar sized non-IQ quants)
| Link | Type | Size | Notes |
|:---|:---|---:|:---|
| GGUF | IQ1_S | 5.27 GB | Lowest quality, uses SOTA techniques to be usable. |
| GGUF | IQ1_M | 5.75 GB | Extremely low quality, uses SOTA techniques to be usable. |
| GGUF | IQ2_XXS | 6.55 GB | Very low quality, uses SOTA techniques to be usable. |
| GGUF | IQ2_XS | 7.21 GB | Low quality, uses SOTA techniques to be usable. |
| GGUF | IQ2_S | 7.48 GB | Low quality, uses SOTA techniques to be usable. |
| GGUF | IQ2_M | 8.11 GB | Relatively low quality, uses SOTA techniques to be surprisingly usable. |
| GGUF | Q2_K | 8.89 GB | Very low quality but surprisingly usable. |
| GGUF | IQ3_XXS | 9.28 GB | Lower quality, new method with decent performance, comparable to Q3 quants. |
| GGUF | Q2_K_L | 9.55 GB | Uses Q8_0 for embed and output weights. Very low quality but surprisingly usable. |
| GGUF | IQ3_XS | 9.91 GB | Lower quality, new method with decent performance, slightly better than Q3_K_S. |
| GGUF | IQ3_S | 10.4 GB | Lower quality, slightly better than IQ3_XS. |
| GGUF | Q3_K_S | 10.4 GB | Low quality, not recommended. |
| GGUF | IQ3_M | 10.7 GB | Medium-low quality, new method with decent performance comparable to Q3_K_M. |
| GGUF | Q3_K_M | 11.5 GB | Lower quality but usable, good for low RAM availability. |
| GGUF | Q3_K_L | 12.4 GB | Uses Q8_0 for embed and output weights. Lower quality but usable, good for low RAM availability. |
| GGUF | IQ4_XS | 12.8 GB | Decent quality, smaller than Q4_K_S with similar performance, recommended. |
| GGUF | IQ4_NL | 13.5 GB | Similar to IQ4_XS, but slightly larger. Offers online repacking for ARM CPU inference. |
| GGUF | Q4_0 | 13.5 GB | Legacy format, offers online repacking for ARM and AVX CPU inference. |
| GGUF | Q4_K_S | 13.5 GB | Slightly lower quality with more space savings, recommended. |
| GGUF | Q4_K_M | 14.3 GB | Good quality, default size for most use cases, recommended. |
| GGUF | Q4_K_L | 14.8 GB | Uses Q8_0 for embed and output weights. Good quality, recommended. |
| GGUF | Q4_1 | 14.9 GB | Legacy format, similar performance to Q4_K_S but with improved tokens/watt on Apple silicon. |
| GGUF | Q5_K_S | 16.3 GB | High quality, recommended. |
| GGUF | Q5_K_M | 16.8 GB | High quality, recommended. |
| GGUF | Q5_K_L | 17.2 GB | Uses Q8_0 for embed and output weights. High quality, recommended. |
| GGUF | Q6_K | 19.3 GB | Very high quality, near perfect, recommended. |
| GGUF | Q6_K_L | 19.7 GB | Uses Q8_0 for embed and output weights. Very high quality, near perfect, recommended. |
| GGUF | Q8_0 | 25.1 GB | Extremely high quality, generally unneeded but max available quant. |
| GGUF | Q8_K_XL | 29 GB | Uses FP16 for embed and output weights via Unsloth Dynamic 2.0, near perfect quality. |
| GGUF | FP16 | 47.2 GB | Full BF16 weights, maximum quality. |
| SAFE | FP32 | 47.2 GB | Full precision safetensors. |

If you need a quant that isn't uploaded you can open a request.
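As a rough rule of thumb, the file sizes in the table map to effective bits per weight (bpw). A back-of-the-envelope check, assuming the 23.6B parameter count listed below and the decimal gigabytes given above (an approximation: real GGUF files mix tensor types and carry metadata, so true bpw varies per quant):

```python
# Rough bits-per-weight estimate from file size, assuming
# 23.6B parameters and decimal GB as listed in the table.
PARAMS = 23.6e9

def bits_per_weight(size_gb: float) -> float:
    return size_gb * 1e9 * 8 / PARAMS

for name, gb in [("IQ2_M", 8.11), ("Q4_K_M", 14.3), ("Q8_0", 25.1)]:
    print(f"{name}: ~{bits_per_weight(gb):.2f} bpw")
```

This puts Q4_K_M at roughly 4.8 bpw, which is consistent with its position as the default recommendation.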

Here is a useful tool which allows you to recreate UD quants: https://github.com/electroglyph/quant_clone

There is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better). And here are Artefact2's thoughts on the matter: https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9

Merge Method

This model was merged with the [DARE_TIES] merge method, using [anthracite-core/Mistral-Small-3.2-24B-Instruct-2506-Text-Only] as the base.

Models Merged

The following models were included in the merge:
  • anthracite-core/Mistral-Small-3.2-24B-Instruct-2506-Text-Only
    Base model used for DARE_TIES, better (than 2503) at following precise instructions, produces less infinite generations or repetitive answers, more robust function calling template.
  • aixonlab/Eurydice-24b-v3.5
    Creativity, natural conversation and storytelling, trained on a custom dataset specifically crafted to enhance its capabilities.
  • allura-forge/ms32-final-TEXTONLY
    Roleplaying, storywriting, strong prose and character portrayal, differently-flavored general instruct usecases, trained on various sources of storytelling and RP data, KTO'd to improve storywriting and anti-slop.
  • Darkhn/M3.2-24B-Animus-V7.1
    Creative storytelling, roleplaying and instruction-following within the Wings of Fire universe, high-quality, immersive and coherent conversations, surprising capability in general roleplay with enhanced versatility.
  • Delta-Vector/Austral-24B-Winton
    Unslopped finetune of Harbinger 24B to be a generalist Roleplay/Adventure model with improved writing.
  • Delta-Vector/MS3.2-Austral-Winton
    Unslopped finetune of Codex 24B to be a generalist Roleplay/Adventure model with improved writing.
  • Delta-Vector/Rei-24B-KTO
    Replicates the style and prose of Anthropic Claude Models, Roleplaying/Creative-writing, Smart without being too sloppy, SFT trained on PaintedFantasy (v1) then KTO'd to improve coherency and Instruct Following.
  • Doctor-Shotgun/MS3.2-24B-Magnum-Diamond
    Emulates the prose style and quality of the Claude 3 Sonnet/Opus series of models on a local scale.
  • Gryphe/Codex-24B-Small-3.2
    Research-oriented synthetic roleplay experiment which embraces the full human spectrum of diverse storytelling, including curated Pantheon interactions, DeepSeek V3/R1 roleplay data, and text adventure compilations.
  • Gryphe/Pantheon-RP-1.8-24b-Small-3.1
    Enhances the general roleplay experience, helping to encompass personality traits, accents and mannerisms, regenerated using Sonnet 3.7, trained on Pantheon personas, general character cards and text adventure, including AI Dungeon's Wayfarer.
  • LatitudeGames/Harbinger-24B
    Immersive adventures and stories where consequences feel real, enhanced instruction following, improved continuation, strengthened narrative coherence, polished outputs with fewer clichΓ©s and repetitions/artifacts, more consistent character behaviors and storytelling flow.
  • PocketDoc/Dans-PersonalityEngine-V1.3.0-24b
    Fine-tuned on 50+ datasets, designed to excel at both creative tasks (like roleplay and co-writing) and technical challenges (such as code generation, tool use, and complex reasoning), multilingual capabilities with support for 10 languages and enhanced domain expertise across multiple fields.
  • ReadyArt/MS3.2-The-Omega-Directive-24B-Unslop-v2.1
    Unslopped, Unaligned, Uncensored, NSFW, extreme roleplay, improved coherence, visceral narratives, subtle nuances, fluent in 9 languages.
  • SicariusSicariiStuff/Impish_Magic_24B
    A superb assistant, unhinged tsundere/yandere RP, trained on high quality fighting and adventure data for Morrowind/Kenshi and more, slightly less positivity bias.
  • TheDrummer/Cydonia-24B-v4
    RP training, unslopping, unalignment, creative works, new dataset to enhance adherence and flow, grid search for stable parameters; A wordy and thick model with a novel style, distinct flair for making scenarios feel more fleshed out without being excessively flowery, good at long-form storytelling with weighty prose when acting as a Narrator or Dungeon Master, performs admirably for coding/assistance, descriptive and good at pulling details from the character card.
  • trashpanda-org/MS3.2-24B-Mullein-v2
    Predisposition to NPC characterization, accurate character/scenario portrayal, somewhat unhinged bias, strong adherence to message structure, varied rerolls, good character/scenario handling, almost no user impersonation, follows up messages from larger models quite nicely, trained on Sugarquill: Erebus (Shinen), r_shortstories, Dungeon Master, Opus and other datasets.
  • zerofata/MS3.2-PaintedFantasy-v2-24B
    Uncensored creative model intended to excel at character driven RP/ERP, designed to provide longer, narrative heavy responses where characters are portrayed accurately and proactively, trained on light novels and Frieren wiki data, enhanced instruction following and reduced Mistral-isms, v2 has a heavy focus on reducing repetition and improved instruction following.

Configuration

The following YAML configuration was used to produce this model:
base_model: anthracite-core/Mistral-Small-3.2-24B-Instruct-2506-Text-Only
merge_method: dare_ties
dtype: bfloat16
models:
  - model: aixonlab/Eurydice-24b-v3.5
    parameters:
      density: 0.5
      weight: 0.08
  - model: allura-forge/ms32-final-TEXTONLY
    parameters:
      density: 0.5
      weight: 0.08
  - model: Darkhn/M3.2-24B-Animus-V7.1
    parameters:
      density: 0.5
      weight: 0.08
  - model: Delta-Vector/Austral-24B-Winton
    parameters:
      density: 0.5
      weight: 0.08
  - model: Delta-Vector/MS3.2-Austral-Winton
    parameters:
      density: 0.5
      weight: 0.08
  - model: Delta-Vector/Rei-24B-KTO
    parameters:
      density: 0.5
      weight: 0.06
  - model: Doctor-Shotgun/MS3.2-24B-Magnum-Diamond
    parameters:
      density: 0.5
      weight: 0.08
  - model: PocketDoc/Dans-PersonalityEngine-V1.3.0-24b
    parameters:
      density: 0.5
      weight: 0.08
  - model: ReadyArt/MS3.2-The-Omega-Directive-24B-Unslop-v2.1
    parameters:
      density: 0.5
      weight: 0.08
  - model: SicariusSicariiStuff/Impish_Magic_24B
    parameters:
      density: 0.5
      weight: 0.08
  - model: TheDrummer/Cydonia-24B-v4
    parameters:
      density: 0.5
      weight: 0.08
  - model: trashpanda-org/MS3.2-24B-Mullein-v2
    parameters:
      density: 0.5
      weight: 0.08
  - model: zerofata/MS3.2-PaintedFantasy-v2-24B
    parameters:
      density: 0.5
      weight: 0.06
tokenizer:
  source: union
chat_template: auto
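One property worth noting in this config: the thirteen per-model weights (eleven at 0.08, two at 0.06) sum to 1.0, keeping the combined delta contribution normalized relative to the base model. A quick sanity check:

```python
# Weights from the YAML above: 11 models at 0.08, 2 at 0.06.
weights = [0.08] * 11 + [0.06] * 2
total = sum(weights)
assert abs(total - 1.0) < 1e-9  # normalized within floating-point error
print(f"{len(weights)} models, total weight {total:.2f}")
```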

Use with llama.cpp

Install llama.cpp through brew (works on Mac and Linux)
brew install llama.cpp
Invoke the llama.cpp server or the CLI.

CLI:

llama-cli --hf-repo Fentible/Cthulhu-24B-v1.2 --hf-file Cthulhu-24B-v1.2-IQ4_XS.gguf -p "The meaning to life and the universe is"

Server:

llama-server --hf-repo Fentible/Cthulhu-24B-v1.2 --hf-file Cthulhu-24B-v1.2-IQ4_XS.gguf -c 2048
Note: you can also use this checkpoint directly through the usage steps listed in the llama.cpp repo.

Step 1: Clone llama.cpp from GitHub.
git clone https://github.com/ggerganov/llama.cpp
Step 2: Move into the llama.cpp folder and build it with LLAMA_CURL=1 flag along with other hardware-specific flags (for ex: LLAMA_CUDA=1 for Nvidia GPUs on Linux).
cd llama.cpp && LLAMA_CURL=1 make
Step 3: Run inference through the main binary.
./llama-cli --hf-repo Fentible/Cthulhu-24B-v1.2 --hf-file Cthulhu-24B-v1.2-IQ4_XS.gguf -p "The meaning to life and the universe is"
or
./llama-server --hf-repo Fentible/Cthulhu-24B-v1.2 --hf-file Cthulhu-24B-v1.2-IQ4_XS.gguf -c 2048

Example Output

Downloads last month: 17
Model size: 23.6B params (BF16 safetensors)
