|
|
--- |
|
|
base_model: |
|
|
- allura-org/MS3.2-24b-Angel |
|
|
library_name: transformers |
|
|
thumbnail: >- |
|
|
https://cdn-uploads.huggingface.co/production/uploads/634262af8d8089ebaefd410e/g6hHxcdrD8r-HSUAz9b89.png |
|
|
tags: |
|
|
- axolotl |
|
|
- unsloth |
|
|
- roleplay |
|
|
- conversational |
|
|
datasets: |
|
|
- PygmalionAI/PIPPA |
|
|
- Alfitaria/nemotron-ultra-reasoning-synthkink |
|
|
- PocketDoc/Dans-Prosemaxx-Gutenberg |
|
|
- FreedomIntelligence/Medical-R1-Distill-Data |
|
|
- cognitivecomputations/SystemChat-2.0 |
|
|
- allenai/tulu-3-sft-personas-instruction-following |
|
|
- kalomaze/Opus_Instruct_25k |
|
|
- simplescaling/s1K-claude-3-7-sonnet |
|
|
- ai2-adapt-dev/flan_v2_converted |
|
|
- grimulkan/theory-of-mind |
|
|
- grimulkan/physical-reasoning |
|
|
- nvidia/HelpSteer3 |
|
|
- nbeerbower/gutenberg2-dpo |
|
|
- nbeerbower/gutenberg-moderne-dpo |
|
|
- nbeerbower/Purpura-DPO |
|
|
- antiven0m/physical-reasoning-dpo |
|
|
- allenai/tulu-3-IF-augmented-on-policy-70b |
|
|
- allenai/href |
|
|
--- |
|
|
# MLX format for Angel 24b |
|
|
Get em while they're hot. |
|
|
|
|
|
This one is at Q8 quality. Vision stack appears to be mangled somewhat, sadly. |
|
|
|
|
|
|
|
|
# Angel 24b |
|
|
|
|
|
 |
|
|
|
|
|
***Better to reign in Hell than serve in Heaven.*** |
|
|
|
|
|
# Overview |
|
|
MS3.2-24b-Angel is a model finetuned from Mistral Small 3.2 for roleplaying, storywriting, and differently-flavored general instruct usecases. |
|
|
|
|
|
Testing revealed strong prose and character portrayal for its class, rivalling the preferred 72B models of some testers. |
|
|
|
|
|
# Quantizations |
|
|
EXL3: |
|
|
- [Official EXL3 quants](https://huggingface.co/allura-quants/allura-org_MS3.2-24b-Angel-EXL3) (thanks artus <3) |
|
|
|
|
|
GGUF: |
|
|
- [Official GGUF imatrix quants w/ mmproj](https://hf.co/allura-quants/allura-org_MS3.2-24b-Angel-GGUF) (thanks artus, again <3) |
|
|
|
|
|
MLX: |
|
|
- TODO! :3 |
|
|
|
|
|
# Usage |
|
|
- Use Mistral v7 Tekken. |
|
|
- It is **highly recommended** (if your framework supports it) to use the official Mistral tokenization code instead of Huggingface's. This is possible in vLLM with `--tokenizer-mode mistral`. |
|
|
- Recommended samplers (from CURSE and corroborated by me, Fizz) are 1.2 temperature, 0.1 min_p, and 1.05 repetition penalty. |
|
|
- We recommend *a* system prompt, but its contents only faintly matter (I accidentally had an assistant system prompt during the entire time I was testing) |
|
|
|
|
|
# Training Process |
|
|
1. [The original model had its vision adapter removed](https://huggingface.co/anthracite-core/Mistral-Small-3.2-24B-Instruct-2506-Text-Only) for better optimization and easier usage in training frameworks |
|
|
2. The model was then put through an SFT process (using Axolotl) on various sources of general instruct, storytelling, and RP data, which resulted in [allura-forge/ms32-sft-merged](https://hf.co/allura-forge/ms32-sft-merged). |
|
|
3. Afterwards, the model was put through a KTO process (using Unsloth) on more focused storywriting and anti-slop data, as well as general instruction following and human preference, which resulted in the final checkpoints at [allura-forge/ms32-final-TEXTONLY](https://hf.co/allura-forge/ms32-final-TEXTONLY). |
|
|
4. Finally, the vision tower was manually added back to the weights to continue to support multimodality. |
|
|
|
|
|
# Credits |
|
|
- Fizz - training and data wrangling |
|
|
- Artus (by proxy) & Bot - help with funding |
|
|
- CURSE - testing |
|
|
- Mango - testing, data, help with KTO configs |
|
|
- DoctorShotgun - making the original text-only model |
|
|
- Axolotl & Unsloth - creating the training frameworks used for parts of this finetune |
|
|
- Everyone in Allura - moral support, being cool |
|
|
- Vivziepop and co - Angel Dust |
|
|
|
|
|
<3 love you all |