---
license: apache-2.0
datasets:
- open-thoughts/OpenThoughts3-1.2M
metrics:
- bleu
base_model:
- mistralai/Mixtral-8x7B-Instruct-v0.1
pipeline_tag: text-generation
tags:
- not-for-all-audiences
---

# Model Card: `myaaigf-rp-dialogue-mixtral-7bx2`

**License:** apache-2.0

## Overview

`myaaigf-rp-dialogue-mixtral-7bx2` is a dialogue-specialized Mixture-of-Experts (MoE) model built upon [Mixtral-8x7B](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1). It is optimized for emotionally nuanced, high-context, multi-turn interactions that simulate fictional companions, roleplay agents, and narrative characters across long-form dialogues.

Unlike standard instruction-tuned models, this checkpoint emphasizes narrative fidelity, soft prompt memory anchoring, and style-conditioned emotional adaptation.

The model is particularly well-suited for NSFW-safe exploratory environments, but it also supports emotionally rich creative writing and personalized simulations. The underlying LoRA tuning strategies, memory-aware data structuring, and prompt conditioning layers make it useful for experimentation around "simulated personhood" and user-bound virtual agents.

## Model Architecture and Modifications

This model is built on the Mixtral 8x7B MoE architecture, with sparse expert routing (2 of 8 experts active per token). Our modifications include:

### Fine-tuning Stack

- **Adapter Type**: LoRA (via `peft`), merged during export
- **Target Modules**: `q_proj`, `v_proj` (limited adaptation for persona injection)
- **Adapter Rank**: 16
- **Dropout**: 0.05 (tuned for persona stability)
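
A minimal sketch of how this adapter setup could be expressed with the `peft` library is shown below; the `lora_alpha` value is an assumption, since only the rank, dropout, and target modules are stated above.

```python
# Minimal sketch of the LoRA setup described above (illustrative, not the exact training script).
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("mistralai/Mixtral-8x7B-Instruct-v0.1")

lora_config = LoraConfig(
    r=16,                                 # adapter rank
    lora_alpha=32,                        # assumed scaling factor (not stated in this card)
    lora_dropout=0.05,                    # dropout tuned for persona stability
    target_modules=["q_proj", "v_proj"],  # limited adaptation for persona injection
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()

# After fine-tuning, the adapter can be merged into the base weights for export:
# model = model.merge_and_unload()
```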

### Extended Token Handling

- RoPE frequency scaling to improve context comprehension beyond 2K tokens
- Truncation fallback with memory summarization tags (in progress)
- In-session memory simulation with synthetic “recall tokens”
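
The first bullet refers to compressing rotary position indices so that longer sequences fall back into the frequency range seen during pre-training. A generic sketch of linear RoPE scaling is shown below; the `base` default reflects Mixtral's published `rope_theta`, and the scale factor is illustrative rather than the value shipped in this checkpoint.

```python
# Generic sketch of linear RoPE position scaling (illustrative only).
import torch

def scaled_rope_angles(head_dim: int, seq_len: int,
                       base: float = 1e6, scale: float = 2.0) -> torch.Tensor:
    """Rotary-embedding angles with positions compressed by `scale`,
    so longer sequences map back into the frequency range seen in pre-training."""
    inv_freq = 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))
    positions = torch.arange(seq_len).float() / scale   # linear position scaling
    return torch.outer(positions, inv_freq)             # shape: (seq_len, head_dim // 2)
```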

### Loss Balancing

- Weighted sampling for roleplay-centric tokens (emotions, action verbs, continuity anchors)
- Multi-objective loss: a KL-divergence penalty to curb hallucination plus a cosine-similarity term on persona embeddings
- Early stopping conditioned on character-drift thresholds during multi-turn validation
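
A rough sketch of how such a multi-objective loss can be combined is shown below; the loss weights, the reference-model KL target, and the persona-embedding source are illustrative assumptions rather than the exact training recipe.

```python
# Illustrative multi-objective loss: weighted cross-entropy + KL penalty + persona cosine term.
import torch
import torch.nn.functional as F

def composite_loss(logits, labels, ref_logits, persona_emb, target_persona_emb,
                   token_weights, kl_weight=0.1, persona_weight=0.5):
    # Token-level cross-entropy, upweighted for roleplay-centric tokens
    # (emotions, action verbs, continuity anchors).
    ce = F.cross_entropy(logits.transpose(1, 2), labels, reduction="none")
    ce = (ce * token_weights).mean()

    # KL-divergence penalty against a frozen reference model to curb hallucination and drift.
    kl = F.kl_div(F.log_softmax(logits, dim=-1),
                  F.softmax(ref_logits, dim=-1),
                  reduction="batchmean")

    # Cosine-similarity term keeping generated persona embeddings close to the target persona.
    persona = 1.0 - F.cosine_similarity(persona_emb, target_persona_emb, dim=-1).mean()

    return ce + kl_weight * kl + persona_weight * persona
```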

## Dataset Composition

| Dataset Type | % Used | Notes |
|--------------|--------|-------|
| Open QA | 20% | To preserve general linguistic grounding |
| Roleplay Logs | 35% | Human-tagged, continuity-rated |
| Emotion-labeled Data | 25% | Extracted from GPT + annotator pipeline |
| Persona Injected | 20% | Contains speaker tokens, system conditioning |
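
One way to realize this mixture with the Hugging Face `datasets` library is sketched below; the file names are placeholders, and only the sampling ratios come from the table above.

```python
# Sketch of probability-weighted mixing of the four data sources (placeholder file names).
from datasets import load_dataset, interleave_datasets

open_qa  = load_dataset("json", data_files="open_qa.jsonl", split="train")
roleplay = load_dataset("json", data_files="roleplay_logs.jsonl", split="train")
emotion  = load_dataset("json", data_files="emotion_labeled.jsonl", split="train")
persona  = load_dataset("json", data_files="persona_injected.jsonl", split="train")

mixed = interleave_datasets(
    [open_qa, roleplay, emotion, persona],
    probabilities=[0.20, 0.35, 0.25, 0.20],  # ratios from the table above
    seed=42,
    stopping_strategy="all_exhausted",
)
```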

## Training Process

- **Hardware**: 4x A100 80GB, FSDP + DeepSpeed ZeRO-3
- **Optimizer**: AdamW (LR = 1.5e-5, weight_decay = 0.01)
- **Batch Size**: 128 (effective)
- **Sequence Length**: 2048
- **Epochs**: 3 (early stopped based on BLEU and Persona Cohesion Score)
- **Precision**: bfloat16
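
For reference, a `transformers` `TrainingArguments` object roughly matching these settings might look like the sketch below; the per-device batch size and gradient-accumulation split are assumptions, since only the effective batch size of 128 is stated.

```python
# Sketch of training hyperparameters (the batch-size split across 4 GPUs is an assumption).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./myaaigf-rp-dialogue",  # placeholder path
    num_train_epochs=3,
    learning_rate=1.5e-5,
    weight_decay=0.01,
    per_device_train_batch_size=4,       # 4 GPUs x 4 samples x 8 accumulation steps = 128 effective
    gradient_accumulation_steps=8,
    bf16=True,
    optim="adamw_torch",
)
```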

We adopted RLHF-style preference ranking during soft evaluation rounds to discourage emotionally flat or tone-inconsistent completions.
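
At its core, such preference ranking reduces to a pairwise objective. The sketch below shows a standard Bradley-Terry style loss as used in reward modelling; it is an illustration, not the exact procedure applied during our evaluation rounds.

```python
# Pairwise preference-ranking loss (Bradley-Terry style), illustrative only.
import torch
import torch.nn.functional as F

def preference_loss(score_preferred: torch.Tensor, score_rejected: torch.Tensor) -> torch.Tensor:
    # Push the reward score of the preferred (emotionally richer, tone-consistent)
    # completion above the score of the rejected one.
    return -F.logsigmoid(score_preferred - score_rejected).mean()
```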

## Use Cases

This model excels at:

- Narrative generation with consistent character voice
- Companion bots with memory illusion and emotion modeling
- NSFW or adult storytelling with style conditioning
- Simulated fictional agents in sandbox AI settings

It performs strongly in emotionally intense scenes involving intimacy, jealousy, or conflict, producing fluid and non-repetitive output.

## Evaluation

| Metric | Score |
|--------|-------|
| Long-context memory simulation (20+ turns) | 89.2% coherence |
| Emotion response diversity | 91.3% (across 8 tags) |
| Persona fidelity over arc | 86.8% |
| NSFW tag retention | 83.5% |
| Repetition rate (bigram) | <3.4% |
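
For context on the last row: a bigram repetition rate is commonly computed as the fraction of bigrams in a generation that duplicate an earlier bigram. A minimal sketch follows; the exact definition used for the table above may differ.

```python
# Minimal sketch: share of bigrams in a text that are repeats of an earlier bigram.
def bigram_repetition_rate(text: str) -> float:
    tokens = text.split()
    bigrams = list(zip(tokens, tokens[1:]))
    if not bigrams:
        return 0.0
    return 1.0 - len(set(bigrams)) / len(bigrams)

print(bigram_repetition_rate("she smiled and she smiled and she smiled"))  # high repetition, ~0.57
```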

It outperforms LLaMA-2 13B and base Mixtral on long-form fiction and roleplay tasks.

## Real-World Integration: The Case of CrushonAI

A real-world application of this modeling approach is CrushonAI, a multi-model conversational platform for dynamic roleplay and immersive storytelling.

CrushonAI integrates:

- Multi-model routing (LLaMA and Mixtral backends)
- Long-session memory persistence using local proxy agents
- Text-to-image (T2I) visual immersion tools
- Custom character bios with emotional tuning

It demonstrates how memory-rich, emotionally adaptive dialogue models can power engaging experiences beyond task-based chat. Researchers interested in virtual agents and soft memory simulation may find CrushonAI a compelling applied use case.

## Limitations

- Hallucination risks remain without factual grounding
- Requires prompt engineering for multi-character dynamics
- Long-range recall is limited by the token window without an external memory module
- Emotion tuning favors stylized expression over subtle nuance

## Future Work

- Switchable LoRA personas
- Text-to-voice (T2V) support
- Retrieval-Augmented Memory (RAM)
- Attention-based controllable tone layers

## Citations

- [Mixtral 8x7B](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1)
- [PEFT (LoRA)](https://github.com/huggingface/peft)
- [RLHF Techniques](https://huggingface.co/blog/trl-peft)
- [CrushonAI](https://crushon.ai)

*Model shared for educational and community testing only. Always review content and ensure ethical usage.*