---
language:
  - en
  - fr
  - de
  - es
  - pt
  - it
  - ja
  - ko
  - ru
  - zh
  - ar
  - fa
  - id
  - ms
  - ne
  - pl
  - ro
  - sr
  - sv
  - tr
  - uk
  - vi
  - hi
  - bn
tags:
  - Magistral-Small-2506
  - bfloat16
  - thinking
  - reasoning
  - all use cases
  - creative use cases
  - creative
  - creative writing
  - fiction writing
  - plot generation
  - sub-plot generation
  - story generation
  - scene continue
  - storytelling
  - fiction story
  - science fiction
  - romance
  - all genres
  - story
  - writing
  - vivid prose
  - vivid writing
  - fiction
  - roleplaying
  - swearing
  - rp
  - NEO Imatrix
  - GGUFs
  - Maxed Output Tensor
license: apache-2.0
base_model:
  - mistralai/Mistral-Small-3.1-24B-Instruct-2503
---

(Quants are uploading; examples and model card updates to follow.)

# Magistral-Small-2506-Reasoning-24B-NEO-MAX-Imatrix-GGUF

NEO Imatrix quants, with a MAX output tensor (BF16 / full precision) to improve the reasoning and output generation of Mistral's new reasoning model, "Magistral-Small-2506":

https://huggingface.co/mistralai/Magistral-Small-2506/

About these GGUFs:

- Quantized using the NEO Imatrix dataset.
- The output tensor is set at BF16 / 16-bit full precision.
- Correct Jinja template, with the "System Prompt" for reasoning embedded.
- 32K / 32,768 max context (the default set at the original repo).
- Suggested minimum context of 4K-8K for reasoning/output; see the loading sketch after this list.
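
As a minimal sketch of loading one of these GGUFs at the full 32K context, assuming llama-cpp-python (the filename below is a placeholder for whichever quant you download). Recent llama-cpp-python builds read the GGUF's embedded Jinja chat template, including the reasoning system prompt, automatically:

```python
# Minimal sketch using llama-cpp-python (pip install llama-cpp-python).
from llama_cpp import Llama

llm = Llama(
    model_path="Magistral-Small-2506-Reasoning-24B-NEO-MAX-Q4_K_M.gguf",  # placeholder filename
    n_ctx=32768,    # 32K max context, the default for this repo
    verbose=False,
)

# create_chat_completion applies the GGUF's embedded chat template,
# which carries the reasoning system prompt.
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Explain why the sky appears blue."}],
)
print(out["choices"][0]["message"]["content"])
```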

An additional repo of GGUFs set at 128K / 131,072 context will follow, per Mistral's notes that the model was trained at 128K max context.

For temperature, top-k, top-p, and other suggested parameter settings, please see the notes at:

https://huggingface.co/mistralai/Magistral-Small-2506/
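
As a sketch of where those settings plug in, reusing the `llm` object from the example above; the values here are illustrative placeholders, not Mistral's recommendations, so substitute the ones from the notes linked above:

```python
# Sketch: passing sampling parameters per request with llama-cpp-python.
# All values are placeholders -- use the settings from Mistral's notes.
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Write a one-paragraph science fiction opening."}],
    temperature=0.7,   # placeholder value
    top_k=40,          # placeholder value
    top_p=0.95,        # placeholder value
    max_tokens=1024,
)
print(out["choices"][0]["message"]["content"])
```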

Special thanks to the "MLX-Community" for the correct config/tokenizer files:

https://huggingface.co/mlx-community/Magistral-Small-2506-bf16