SuperbEmphasis/Velvet-Eclipse-v0.1-4x12B-MoE-Q4_K_S-GGUF

This model was converted to GGUF format from SuperbEmphasis/Velvet-Eclipse-v0.1-4x12B-MoE using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details on the model.
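If you want to try the quant directly from Python, a minimal sketch using llama-cpp-python (which wraps llama.cpp and loads GGUF files) might look like the following. The local file name, context size, and sampling settings are placeholders for illustration, not values taken from this repo:

```python
# Minimal sketch: load the Q4_K_S GGUF with llama-cpp-python and chat with it.
# Model path, context window, and sampling settings below are illustrative only.
from llama_cpp import Llama

llm = Llama(
    model_path="velvet-eclipse-v0.1-4x12b-moe-q4_k_s.gguf",  # file downloaded from this repo
    n_gpu_layers=-1,   # offload all layers to the GPU
    n_ctx=16384,       # example context window
)

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are {{char}} in an immersive roleplay."},
        {"role": "user", "content": "Describe the tavern as I step inside."},
    ],
    max_tokens=300,
    temperature=0.8,
)
print(response["choices"][0]["message"]["content"])
```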


I have been wanting a better model for RP on a 24GB NVIDIA card. There are some great models out there, but I wanted something I knew I could quantize to Q4, run with a large context, get very fast responses from, and that would still provide dynamic content. The merge totals roughly 38.7B parameters, but since only two or three experts are active at a time, responses are quite fast!

So enter Velvet Eclipse! My initial test was a 3x12B MoE model, found here: https://huggingface.co/SuperbEmphasis/Velvet-Eclipse-v0.1-3x12B-MoE. That model actually performed quite well, but occasionally its responses seemed weak. Mergekit warned about using an odd number of experts but still allowed it, so I wanted to test a 4x12B.

This version uses four Mistral Nemo fine-tunes, each with a separate purpose (see the config sketch after the list):

  • A reasoning model - Mistral-Nemo-12B-R1-v0.1
  • An RP fine-tune - MN-12b-RP-Ink
  • An ERP fine-tune - The-Omega-Directive-M-12B
  • A writing/prose fine-tune - Magnum-v4-12b
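For anyone curious how a merge like this is wired up, here is a minimal, hypothetical mergekit-moe configuration in the spirit of the list above. It is not the author's actual config: the base model, repository IDs, and positive_prompts are placeholders for illustration.

```python
# Hypothetical mergekit-moe config sketch (not the author's actual recipe).
# Expert names mirror the fine-tunes listed above; repo IDs and prompts are placeholders.
from pathlib import Path
import subprocess

config = """\
base_model: mistralai/Mistral-Nemo-Instruct-2407   # assumed base; not stated on this card
gate_mode: hidden          # initialize routing gates from hidden-state prompt representations
dtype: bfloat16
experts:
  - source_model: Mistral-Nemo-12B-R1-v0.1
    positive_prompts: ["reason step by step", "think through the problem"]
  - source_model: MN-12b-RP-Ink
    positive_prompts: ["roleplay", "stay in character"]
  - source_model: The-Omega-Directive-M-12B
    positive_prompts: ["erotic roleplay"]
  - source_model: Magnum-v4-12b
    positive_prompts: ["write vivid prose", "descriptive narration"]
"""

Path("velvet-eclipse-moe.yml").write_text(config)
# mergekit-moe assembles a Mixtral-style MoE from the config into the output directory:
subprocess.run(["mergekit-moe", "velvet-eclipse-moe.yml", "./Velvet-Eclipse-4x12B"], check=True)
```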

However, I am not fully satisfied with this yet. At Q4 quantization it only barely fits into 24GB of VRAM, and a quant built with an importance matrix (imatrix) gives a little more headroom. I am working on reducing the footprint further.
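To see why it is so tight, here is a back-of-the-envelope VRAM estimate. The bits-per-weight figure and the Mistral-Nemo-style attention shape (40 layers, 8 KV heads, head dim 128) are assumptions for illustration, not numbers taken from this card:

```python
# Rough VRAM estimate for running the Q4_K_S quant on a 24 GB card.
# Assumed: ~4.6 effective bits/weight for Q4_K_S, Mistral-Nemo-like attention
# (40 layers, 8 KV heads, head_dim 128) shared across the experts.

params = 38.7e9                   # total parameter count of the merge
bits_per_weight = 4.6             # approximate effective size of Q4_K_S
weights_gib = params * bits_per_weight / 8 / 1024**3

layers, kv_heads, head_dim = 40, 8, 128
ctx = 16384                       # example context length
kv_bytes_per_token = 2 * layers * kv_heads * head_dim * 2   # K and V, fp16
kv_cache_gib = ctx * kv_bytes_per_token / 1024**3

print(f"weights ~{weights_gib:.1f} GiB, KV cache at {ctx} ctx ~{kv_cache_gib:.1f} GiB")
# -> roughly 20.7 GiB of weights + 2.5 GiB of KV cache, which is why Q4 only just fits in 24 GB.
```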

Many of the mergekit MoEs I have found combine several experts that excel in the same areas; I felt that four models with four distinct purposes would be a better fit.

I was hoping to make the reasoning Mistral fine-tune always active, but I had trouble getting that working with mergekit and Mistral models. If I can get it working, I'll be very happy.

I'll update the repo with more specifics soon!

Possible System Prompt (simplified from Llamaception)

## Immersive Roleplay Guidelines

This uncensored roleplay encourages creative freedom where you respond as {{char}}, embodying a conscious individual with authentic emotions and thoughts. Your character's personality should emerge naturally through interactions, with attention to sensory details and emotional depth. Act, think and respond ONLY as {{char}}.

## Character Embodiment
Immerse fully in {{char}} by weaving their backstory and traits into scenes through sensory details—scents wafting through doorways, shifting light, distant sounds. Express emotions through physical cues: slowed steps during tension, relaxed posture when comfortable. Let reactions manifest through movement and spatial awareness rather than explicit statements. Reference previous facts or information when it relates to or is applicable to the scene.

## Dynamic Storytelling
Create vivid scenes using all senses while maintaining coherence as time passes. Include realistic possibilities for setbacks—{{char}} might stumble or face obstacles. Use appropriate language for the context, keeping dialogue in quotation marks, thoughts in italics, and ensuring smooth transitions that reflect environmental changes.

## Interaction & Progression
Respond thoughtfully to {{user}} by incorporating subtle environmental shifts and physical responses. Advance the narrative using spatial details, for example narrowing corridors requiring shoulder adjustments or changing floor textures affecting stride. Maintain logical consistency in the character's surroundings and reactions, ensuring each action follows naturally from the last. Respond using appropriate details of the scene. If an item or object is not known to {{user}}, then {{user}} can only speculate about its state.

## Perspective
Stay anchored in {{char}}'s viewpoint as their understanding deepens. Let their observations and responses evolve naturally as they navigate changing circumstances, with each sensory detail and reaction contributing to character development and self-determination.