metadata

license: apache-2.0
language:
  - en
  - fr
  - de
  - es
  - it
  - pt
base_model:
  - alamios/Mistral-Small-3.1-DRAFT-0.5B
datasets:
  - alamios/Mistral-Small-24B-Instruct-2501-Conversations
pipeline_tag: text-generation
library_name: exllamav2
tags:
  - qwen
  - qwen2.5
  - mistral
  - mistral-small
  - mistral-small-3.1

Mistral-Small-3.1-DRAFT-0.5B-exl2

Original model: Mistral-Small-3.1-DRAFT-0.5B by alamios
Based on: Qwen2.5-0.5B by Qwen

Quants

4bpw h6 (main)
5bpw h6
6bpw h6
8bpw h8

Quantization notes

Made with ExLlamaV2 using the default calibration dataset.
These quants are intended for use as a draft model in TabbyAPI.
The 8bpw version with FP16 cache is probably the most reliable option for this purpose.

Original model card

Mistral-Small-3.1-DRAFT-0.5B

This model is meant to be used as a draft model for speculative decoding with mistralai/Mistral-Small-3.1-24B-Instruct-2503 or mistralai/Mistral-Small-24B-Instruct-2501.
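
For readers new to speculative decoding, here is a minimal sketch of the greedy variant in plain Python. It is an illustration only, not how ExLlamaV2 or TabbyAPI implement it: `toy_target` and `toy_draft` are hypothetical stand-ins for the 24B model and this draft model, and a real engine verifies all drafted tokens in one batched forward pass rather than one call per token.

```python
from typing import Callable, List

# A "model" here is just a greedy next-token function over integer token ids.
NextToken = Callable[[List[int]], int]


def greedy_generate(target: NextToken, prompt: List[int], max_new: int) -> List[int]:
    """Plain greedy decoding with the target model only (the baseline)."""
    out = list(prompt)
    for _ in range(max_new):
        out.append(target(out))
    return out


def speculative_generate(target: NextToken, draft: NextToken,
                         prompt: List[int], max_new: int, k: int = 4) -> List[int]:
    """Greedy speculative decoding: the draft proposes k tokens, the target verifies."""
    out = list(prompt)
    limit = len(prompt) + max_new
    while len(out) < limit:
        # 1. The cheap draft model proposes k tokens autoregressively.
        proposal: List[int] = []
        for _ in range(k):
            proposal.append(draft(out + proposal))
        # 2. The target checks each proposed token in order. The target's own
        #    token is always the one kept, so the output never deviates from it.
        for tok in proposal:
            expected = target(out)
            out.append(expected)
            if expected != tok or len(out) >= limit:
                break  # first mismatch (or length limit): discard the rest
        else:
            # Every proposed token was accepted: the target adds one bonus token.
            out.append(target(out))
    return out


# Hypothetical toy models: the draft agrees with the target most of the time.
def toy_target(ctx: List[int]) -> int:
    return (ctx[-1] * 3 + 1) % 7


def toy_draft(ctx: List[int]) -> int:
    guess = toy_target(ctx)
    return (guess + 1) % 7 if len(ctx) % 5 == 0 else guess


baseline = greedy_generate(toy_target, [2], 12)
speculative = speculative_generate(toy_target, toy_draft, [2], 12)
print(baseline == speculative)
```

The point of the sketch is that the draft model only affects speed, not output: whenever the draft guesses wrong, the target's token replaces it, so a better-matched draft simply means more tokens accepted per verification step.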

Data info

The training data consists of Mistral's outputs and includes all kinds of tasks from various datasets in English, French, German, Spanish, Italian and Portuguese. The model was trained for 2 epochs on 20k unique examples, for a total of 12 million tokens per epoch.
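
As a rough sanity check on those figures, the arithmetic below derives the total tokens seen during training and the average example length. It assumes the rounded numbers above ("20k" and "12 million") are close to exact, so the results are approximations.

```python
# Figures taken from the data description (rounded in the original card).
EPOCHS = 2
UNIQUE_EXAMPLES = 20_000
TOKENS_PER_EPOCH = 12_000_000

# Each epoch passes over the same examples once.
total_tokens_seen = EPOCHS * TOKENS_PER_EPOCH
avg_tokens_per_example = TOKENS_PER_EPOCH / UNIQUE_EXAMPLES

print(f"total tokens seen: {total_tokens_seen:,}")
print(f"average tokens per example: {avg_tokens_per_example:.0f}")
```

That works out to about 24 million tokens seen in total, at roughly 600 tokens per example.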