---
language:
- en
- fr
- de
- es
- pt
- it
- ja
- ko
- ru
- zh
- ar
- fa
- id
- ms
- ne
- pl
- ro
- sr
- sv
- tr
- uk
- vi
- hi
- bn
tags:
- Magistral-Small-2506
- bfloat16
- thinking
- reasoning
- all use cases
- creative use cases
- creative
- creative writing
- fiction writing
- plot generation
- sub-plot generation
- story generation
- scene continue
- storytelling
- fiction story
- science fiction
- romance
- all genres
- story
- writing
- vivid prose
- vivid writing
- fiction
- roleplaying
- swearing
- rp
- NEO Imatrix
- GGUFs
- Maxed Output Tensor
license: apache-2.0
base_model:
- mistralai/Mistral-Small-3.1-24B-Instruct-2503
---
(Quants uploading; examples to be added; model card updates to follow.)
# Magistral-Small-2506-Reasoning-24B-NEO-MAX-Imatrix-GGUF
NEO Imatrix quants with a MAX output tensor (BF16 / full precision) to improve the reasoning / output generation
of Mistral's new reasoning model, "Magistral-Small-2506":
https://huggingface.co/mistralai/Magistral-Small-2506/
About these GGUFs:
- Quanted using the NEO Imatrix dataset.
- The output tensor is set at BF16 / 16-bit full precision.
- Correct Jinja template, which includes the "System Prompt" embedded for reasoning.
- 32K / 32,768 max context (the default, as set at the org repo).
- Suggested minimum context of 4K-8K for reasoning/output; see the sketch after this list.
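As a usage illustration, here is a minimal llama-cpp-python sketch that loads one of these quants at the full 32K context and relies on the Jinja chat template embedded in the GGUF. The quant filename below is hypothetical; substitute the file you actually downloaded.

```python
# Minimal sketch, assuming llama-cpp-python is installed and a quant from this
# repo has been downloaded locally. The filename below is hypothetical.
from llama_cpp import Llama

llm = Llama(
    model_path="Magistral-Small-2506-Reasoning-24B-NEO-MAX-Imatrix-Q6_K.gguf",  # hypothetical filename
    n_ctx=32768,  # 32K max context, as set in this repo; 4K-8K minimum suggested
)

# With no chat_format specified, llama-cpp-python falls back to the Jinja
# template embedded in the GGUF, which carries the reasoning system prompt.
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Explain why the sky appears blue."}],
)
print(out["choices"][0]["message"]["content"])
```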
An additional repo of GGUFs set at 128K / 131,072 context will follow, per Mistral's notes that the model
was trained at a maximum context of 128K.
Please see the notes at:
https://huggingface.co/mistralai/Magistral-Small-2506/
for temperature, top-k, top-p, and other suggested parameter settings.
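As an illustration only (treat Mistral's repo as authoritative), a generation call reusing the `llm` object from the sketch above, wired with the temperature 0.7 / top_p 0.95 values listed on the Magistral-Small-2506 model card, might look like:

```python
# Sampler values below are taken from Mistral's Magistral-Small-2506 model
# card; verify them against the linked repo before relying on them.
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Write a short science-fiction scene."}],
    temperature=0.7,  # recommended on Mistral's model card
    top_p=0.95,       # recommended on Mistral's model card
    max_tokens=2048,  # arbitrary cap for this example
)
print(out["choices"][0]["message"]["content"])
```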
Special thanks to the "MLX-Community" for the correct config/tokenizer files:
https://huggingface.co/mlx-community/Magistral-Small-2506-bf16
---