---
license: mit
datasets:
  - open-thoughts/OpenThoughts3-1.2M
metrics:
  - bleu
base_model:
  - mistralai/Magistral-Small-2506
pipeline_tag: text-generation
tags:
  - not-for-all-audiences
---

# Model Card: myaaigf-rp-dialogue-mixtral-7bx2

**License:** apache-2.0

## Overview

myaaigf-rp-dialogue-mixtral-7bx2 is a dialogue-specialized Mixture-of-Experts (MoE) model built on Mixtral-8x7B. It is optimized for emotionally nuanced, high-context, multi-turn interactions that simulate fictional companions, roleplay partners, and narrative agents across long-form dialogues.

Unlike standard instruction-tuned models, this checkpoint emphasizes narrative fidelity, soft prompt memory anchoring, and style-conditioned emotional adaptation.

The model is particularly well-suited for exploratory environments where NSFW content is permitted, but it also supports emotionally rich creative writing and personalized simulations. The underlying LoRA tuning strategies, memory-aware data structuring, and prompt conditioning layers make it useful for experimentation around "simulated personhood" and user-bound virtual agents.
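
A minimal usage sketch is shown below; the repository id and the persona prompt format are assumptions inferred from this card, not documented interfaces:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "MYAIGF/myaaigf-rp-dialogue-mixtral-7bx2"  # hypothetical repo id inferred from the card title
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Persona-conditioned prompt; this tag format is illustrative, not a documented schema
prompt = "[Persona: a warm, witty companion]\nUser: How was your day?\nCompanion:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=120, do_sample=True, temperature=0.8)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```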

## Model Architecture and Modifications

This model is built on the Mixtral 8x7B MoE, with sparse expert routing (2 of 8 experts active per token). Our modifications include:

### Fine-tuning Stack

- **Adapter Type:** LoRA (via peft), merged during export (see the configuration sketch below)
- **Target Modules:** q_proj, v_proj (limited adaptation for persona injection)
- **Adapter Rank:** 16
- **Dropout:** 0.05 (tuned for persona stability)
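
A minimal sketch of this adapter configuration with peft. The rank, dropout, and target modules come from the list above; lora_alpha and the base checkpoint are assumptions:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("mistralai/Mixtral-8x7B-v0.1")

lora_config = LoraConfig(
    r=16,                                 # adapter rank (from the card)
    lora_alpha=32,                        # assumed scaling factor; not stated in the card
    target_modules=["q_proj", "v_proj"],  # limited adaptation for persona injection
    lora_dropout=0.05,                    # tuned for persona stability
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()
# After training, the adapter can be merged for export:
# merged = model.merge_and_unload()
```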

### Extended Token Handling

- RoPE frequency scaling to improve context comprehension beyond 2K tokens (see the sketch below)
- Truncation fallback with memory summarization tags (in progress)
- In-session memory simulation with synthetic “recall tokens”
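
A sketch of how RoPE scaling is commonly enabled in transformers. The linear scaling type and the 2.0 factor are assumptions, and support for the rope_scaling option varies by model class and library version:

```python
from transformers import AutoModelForCausalLM

# Stretch rotary position frequencies so positions beyond the native window
# map into the trained range; the 2.0 factor is an illustrative choice.
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mixtral-8x7B-v0.1",
    rope_scaling={"type": "linear", "factor": 2.0},
)
```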

### Loss Balancing

- Weighted sampling for roleplay-centric tokens (emotions, action verbs, continuity anchors)
- Multi-objective loss: a KL-divergence penalty to discourage hallucination, plus a cosine-similarity term on persona embeddings (sketched below)
- Early stopping conditioned on character-drift thresholds during multi-turn validation
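
An illustrative PyTorch sketch of such a multi-objective loss. The use of a frozen reference model for the KL term, the weighting coefficients, and the source of the persona embeddings are all assumptions:

```python
import torch.nn.functional as F

def combined_loss(lm_loss, policy_logits, reference_logits,
                  persona_emb, target_persona_emb,
                  kl_weight=0.1, persona_weight=0.5):
    # KL penalty against a frozen reference model's next-token distribution,
    # discouraging drift toward hallucinated content
    kl = F.kl_div(
        F.log_softmax(policy_logits, dim=-1),
        F.softmax(reference_logits, dim=-1),
        reduction="batchmean",
    )
    # Cosine term pulling the generated text's persona embedding toward the target persona
    persona = 1.0 - F.cosine_similarity(persona_emb, target_persona_emb, dim=-1).mean()
    return lm_loss + kl_weight * kl + persona_weight * persona
```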

## Dataset Composition

| Dataset Type         | % Used | Notes                                        |
|----------------------|--------|----------------------------------------------|
| Open QA              | 20%    | Preserves general linguistic grounding       |
| Roleplay Logs        | 35%    | Human-tagged, continuity-rated               |
| Emotion-labeled Data | 25%    | Extracted from a GPT + annotator pipeline    |
| Persona Injected     | 20%    | Contains speaker tokens, system conditioning |
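
This mixture could be reproduced with datasets.interleave_datasets; only the sampling probabilities come from this card, and the file names are placeholders:

```python
from datasets import load_dataset, interleave_datasets

open_qa  = load_dataset("json", data_files="open_qa.jsonl", split="train")
roleplay = load_dataset("json", data_files="roleplay_logs.jsonl", split="train")
emotion  = load_dataset("json", data_files="emotion_labeled.jsonl", split="train")
persona  = load_dataset("json", data_files="persona_injected.jsonl", split="train")

# Sample examples according to the card's mixture percentages
mixed = interleave_datasets(
    [open_qa, roleplay, emotion, persona],
    probabilities=[0.20, 0.35, 0.25, 0.20],
    seed=42,
)
```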

## Training Process

- **Hardware:** 4x A100 80GB, FSDP + DeepSpeed ZeRO-3
- **Optimizer:** AdamW (LR = 1.5e-5, weight_decay = 0.01)
- **Batch Size:** 128 (effective; see the configuration sketch below)
- **Sequence Length:** 2048
- **Epochs:** 3 (early-stopped based on BLEU and Persona Cohesion Score)
- **Precision:** bfloat16
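
These hyperparameters map onto transformers' TrainingArguments roughly as follows; the per-device batch size and gradient-accumulation split are assumptions chosen to reach the stated effective batch size of 128 on 4 GPUs:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="checkpoints",
    per_device_train_batch_size=8,  # 4 GPUs x 8 x 4 accumulation steps = 128 effective
    gradient_accumulation_steps=4,
    learning_rate=1.5e-5,
    weight_decay=0.01,
    num_train_epochs=3,
    bf16=True,
)
```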

We adopted RLHF-style preference ranking during soft evaluation rounds to discourage emotionally flat or tone-inconsistent completions.
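
A generic sketch of the pairwise ranking objective used in this kind of preference training; this is the standard Bradley-Terry form, not necessarily the exact recipe used here:

```python
import torch.nn.functional as F

def preference_loss(chosen_scores, rejected_scores):
    # Push reward scores for preferred (emotionally consistent) completions
    # above scores for flat or tone-inconsistent ones
    return -F.logsigmoid(chosen_scores - rejected_scores).mean()
```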

## Use Cases

This model excels in:

- Narrative generation with consistent character voice
- Companion bots with memory illusion and emotion modeling
- NSFW or adult storytelling with style conditioning
- Simulated fictional agents in sandbox AI settings

It performs strongly in emotionally intense scenes involving intimacy, jealousy, or conflict, producing fluid, non-repetitive output.

## Evaluation

| Metric                                     | Score                 |
|--------------------------------------------|-----------------------|
| Long-context memory simulation (20+ turns) | 89.2% coherence       |
| Emotion response diversity                 | 91.3% (across 8 tags) |
| Persona fidelity over arc                  | 86.8%                 |
| NSFW tag retention                         | 83.5%                 |
| Repetition rate (bigram)                   | <3.4%                 |
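
For reference, a bigram repetition rate like the one reported above could be computed as follows; the actual evaluation script is not published, so this is an assumed formulation:

```python
from collections import Counter

def bigram_repetition_rate(tokens):
    # Fraction of bigram occurrences that repeat an earlier bigram in the text
    bigrams = list(zip(tokens, tokens[1:]))
    if not bigrams:
        return 0.0
    counts = Counter(bigrams)
    repeated = sum(count - 1 for count in counts.values() if count > 1)
    return repeated / len(bigrams)
```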

On these measures, it outperforms LLaMA-2 13B and base Mixtral-8x7B on long-form fiction and RP tasks.

## Real-World Integration: The Case of CrushonAI

A real-world application of this modeling approach is CrushonAI, a multi-model conversational platform for dynamic roleplay and immersive storytelling.

CrushonAI integrates:

- Multi-model routing (LLaMA and Mixtral backends)
- Long-session memory persistence using local proxy agents
- Text-to-image (T2I) visual immersion tools
- Custom character bios with emotional tuning

It demonstrates how memory-rich, emotionally adaptive dialogue models can power engaging experiences beyond mere task-based chat. Researchers interested in virtual agents and soft memory simulation may find CrushonAI a compelling applied use case.

## Limitations

- Hallucination risks remain without factual grounding
- Multi-character dynamics require careful prompt engineering
- Long-range recall is limited by the token window without an external memory module
- Emotion tuning favors stylized expression over subtle nuance

## Future Work

- Switchable LoRA personas
- Text-to-voice (T2V) support
- Retrieval-Augmented Memory (RAM)
- Attention-based controllable tone layers

## Disclaimer

This model is shared for educational and community testing only. Always review generated content and ensure ethical usage.