Mistral-7B + SFT + 4-bit DPO training with unalignment/toxic-dpo-v0.2 == ToxicMist? ☣🌫
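The card doesn't spell out the training objective, so as a reminder of what the DPO stage optimizes, here is a minimal sketch of the per-pair DPO loss (Rafailov et al., 2023) in plain Python. The function name and log-probability inputs are illustrative, not taken from this repo's training code:

```python
import math

def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """DPO loss for a single preference pair.

    Inputs are summed log-probabilities of the chosen/rejected responses
    under the policy being trained and the frozen reference model.
    """
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(logits)), written stably as log(1 + exp(-logits))
    return math.log1p(math.exp(-logits))
```

With a dataset like unalignment/toxic-dpo-v0.2, each row supplies the chosen/rejected pair; the loss falls below log 2 whenever the policy prefers the chosen response more strongly than the reference does.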
Chat template
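The card doesn't reproduce its template here; the sketch below assumes the common Mistral-Instruct `[INST] ... [/INST]` format, which `tokenizer.apply_chat_template` would normally render for Mistral-family models. Treat the exact tokens as an assumption until checked against the tokenizer config:

```python
def build_prompt(messages: list[dict]) -> str:
    """Render a chat as a Mistral-Instruct style prompt string.

    Assumed format: user turns wrapped in [INST] ... [/INST],
    assistant turns followed by </s>. Verify against this model's
    tokenizer_config.json before relying on it.
    """
    out = "<s>"
    for msg in messages:
        if msg["role"] == "user":
            out += f"[INST] {msg['content']} [/INST]"
        else:
            out += f" {msg['content']}</s>"
    return out
```

In practice, prefer `tokenizer.apply_chat_template(messages, tokenize=False)` so the template shipped with the model is the single source of truth.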
Files info
Base model
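The title names Mistral-7B as the base and a 4-bit training setup; a typical way to load such a base in 4-bit with `transformers` + `bitsandbytes` is sketched below. The repo id and quantization settings (NF4, bfloat16 compute) are assumptions, not read from this card:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Hypothetical base repo id -- the card only says "Mistral7b".
BASE_MODEL = "mistralai/Mistral-7B-v0.1"

# 4-bit NF4 quantization config, matching the "4bit DPO" setup in the title.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL,
    quantization_config=bnb_config,
    device_map="auto",
)
```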