---
license: other
license_name: mnpl
license_link: https://mistral.ai/static/licenses/MNPL-0.1.md
tags:
  - mistral
  - finetune
  - roleplay
  - chat
  - wings-of-fire
datasets:
  - Darkhn/WOF_QA_V2
  - Darkhn/WOF_Pretraining
  - Darkhn/WOF_V4_Combined_Dataset_deslopped_cleaned
---

MS-3.1-24B-Animus-V4

Wings_of_Fire

Character Card & Lore Book

For the best roleplaying experience, it is highly recommended to use the provided character card and lore book, which have been updated for V4. These files help guide the model's persona and provide rich, in-universe context.

Download the Character Card and Lore Book here

Model Description

This is Version 4 of the fine-tuned mistralai/Mistral-Small-3.1-24B-Instruct-2503, specialized for roleplaying and instruction-following within the Wings of Fire universe. V4 builds upon its predecessors with a refined training methodology focused on data quality and character consistency, resulting in a more coherent and immersive roleplaying experience.

The model was first adapted on a highly cleaned dataset extracted from the Wings of Fire book series. It was then fine-tuned for 2 epochs on a smaller, more potent dataset designed to teach character persistence and advanced roleplay.

The goal of this model is to provide a high-quality, immersive, and lore-accurate conversational experience. It can adopt character personas, answer questions about the world, engage in creative storytelling, portray multiple characters at once, and handle more mature themes from the series with improved logical consistency.

Training Details

Training Hardware

The model was fine-tuned on a single NVIDIA H100 GPU.

Training Procedure

A QLoRA (Quantized Low-Rank Adaptation) approach was used for efficient fine-tuning, with an optimized process configured using Axolotl.
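As an illustration of what such a setup looks like, here is a sketch of an Axolotl QLoRA config. The field names are standard Axolotl options, but the specific values (rank, alpha, sequence length, learning rate, batch sizes) are assumptions for illustration, not the exact settings used for this model.

```yaml
# Illustrative Axolotl QLoRA config; hyperparameter values are
# assumptions, not the settings actually used to train Animus-V4.
base_model: mistralai/Mistral-Small-3.1-24B-Instruct-2503
load_in_4bit: true          # quantized base weights (the "Q" in QLoRA)
adapter: qlora
lora_r: 64
lora_alpha: 32
lora_dropout: 0.05
lora_target_linear: true    # attach adapters to all linear layers
sequence_len: 8192
micro_batch_size: 1
gradient_accumulation_steps: 8
num_epochs: 2               # matches the 2-epoch fine-tuning stage
optimizer: adamw_bnb_8bit
learning_rate: 0.0002
bf16: true
flash_attention: true
datasets:
  - path: Darkhn/WOF_V4_Combined_Dataset_deslopped_cleaned
    type: chat_template
```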

Chat Template

This model uses the Mistral_V7_tekken chat template. It is crucial to format your prompts using this template for the model to function as intended.
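For clarity, the sketch below assembles a prompt in the general shape of Mistral's V7 (tekken) template. The exact token layout here is an assumption for illustration; in practice, prefer the tokenizer's built-in `apply_chat_template` (or your frontend's Mistral V7-tekken preset) rather than hand-building strings.

```python
def build_prompt(system, turns):
    """Assemble a prompt in the approximate Mistral V7 (tekken) layout.

    Assumed layout (verify against the tokenizer's chat template):
    <s>[SYSTEM_PROMPT]...[/SYSTEM_PROMPT][INST]...[/INST]reply</s>...
    `turns` is a list of (user, assistant) pairs; pass None as the
    assistant text for the final turn the model should complete.
    """
    parts = ["<s>"]
    if system:
        parts.append(f"[SYSTEM_PROMPT]{system}[/SYSTEM_PROMPT]")
    for user, assistant in turns:
        parts.append(f"[INST]{user}[/INST]")
        if assistant is not None:
            parts.append(f"{assistant}</s>")
    return "".join(parts)

# Hypothetical roleplay turn, for illustration only
prompt = build_prompt(
    "You are Glory, a RainWing queen.",
    [("Who rules the rainforest?", None)],
)
```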

Training Data

The training process involved two main stages with a strong emphasis on data quality:

  1. Refined Domain Adaptation (Pre-training): The base model was adapted using the Darkhn/WOF_Pretraining dataset. For V4, this dataset was meticulously cleaned to remove extraneous information such as chapter markers, image links (.jpg), and other formatting artifacts. This provides a purer textual foundation for the model to learn from.

  2. Instruction & Chat Fine-tuning: The model was fine-tuned for 2 epochs on a concentrated, high-quality dataset.

    • Quality over Quantity: The roleplay dataset was "deslopped," curated down from 2,700 to 1,500 of the strongest roleplay examples, so the model learns from the best available interactions.
    • Character Persistence Workflow: A key improvement in V4 is the new dataset generation process. Instead of creating new, isolated characters for each example, characters are reused across multiple scenarios. This method trains the model to maintain a character's personality, traits, and history consistently, significantly reducing the logical mistakes and contradictions seen in previous versions.
    • The dataset continues to feature multi-turn scenarios, portrayal of multiple characters, and the more mature or 'darker' themes present in the book series.

Intended Use & Limitations

  • Intended Use: This model is intended for creative and roleplaying purposes within the Wings of Fire universe. It is designed for fans of the series and is not a general-purpose chatbot.

  • Limitations & Quirks:

    • Performance on tasks outside of its training domain (general knowledge, coding, etc.) is not guaranteed and will likely be poor.
    • Character Consistency: While all models can have logical lapses, V4's training was specifically designed to improve character persistence. It should be noticeably better at remembering character traits and history across a conversation compared to previous versions.
    • The model may "hallucinate" or generate plausible but non-canonical information.
    • Content: The roleplay training data includes more mature and darker themes from the Wings of Fire series, such as character death, conflict, and moral ambiguity. The model is capable of generating content reflecting these themes, including gratuitous or explicit content; as always, what users do with that output is up to them.
    • Formatting: The training data was cleaned to remove formatting artifacts like asterisks (*...*) for single word emphasis. The model should now produce cleaner, more narrative-style prose.
    • Safety: This model has not undergone additional safety alignment beyond what was included in its base Mistral-Small-3.1 model. Standard responsible AI practices should be followed.

Recommended Sampler Settings

For performance that balances creativity and coherence, the following sampler settings are recommended as a starting point.

  • Temperature: 0.7
  • Min_P: 0.035
  • DRY Sampler:
    • Multiplier: 0.8
    • Allowed Length: 4
    • Base: 1.75
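To make the Min_P setting concrete, the sketch below implements temperature scaling followed by Min-P filtering over a toy logit list. This is a generic, self-contained illustration of how the sampler treats these values, not code from any particular inference backend (backends such as llama.cpp or SillyTavern expose these as settings).

```python
import math

def min_p_filter(logits, temperature=0.7, min_p=0.035):
    """Temperature-scale logits, then apply Min-P filtering.

    Min-P keeps only tokens whose probability is at least
    min_p * (probability of the most likely token), then
    renormalizes the surviving probabilities.
    """
    scaled = [l / temperature for l in logits]
    # Numerically stable softmax
    m = max(scaled)
    exps = [math.exp(l - m) for l in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Threshold is relative to the top token's probability
    threshold = min_p * max(probs)
    kept = [(i, p) for i, p in enumerate(probs) if p >= threshold]
    norm = sum(p for _, p in kept)
    return {i: p / norm for i, p in kept}

# Toy logits: the two strongest candidates survive, the long tail is cut
dist = min_p_filter([5.0, 4.0, 1.0, -2.0])
```

Lower Min_P values keep more of the tail (more creative, riskier); higher values prune harder toward the top tokens.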

Acknowledgements

  • Credit to Mistral AI for the powerful Mistral-Small-3.1 architecture.
  • Credit to Evan Armstrong for Augmentoolkit, which was used to generate the dataset.