---
license: apache-2.0
language:
- en
- fr
- de
- es
- it
- pt
base_model:
- alamios/Mistral-Small-3.1-DRAFT-0.5B
datasets:
- alamios/Mistral-Small-24B-Instruct-2501-Conversations
pipeline_tag: text-generation
library_name: exllamav2
tags:
- qwen
- qwen2.5
- mistral
- mistral-small
- mistral-small-3.1
---
# Mistral-Small-3.1-DRAFT-0.5B-exl2
Original model: [Mistral-Small-3.1-DRAFT-0.5B](https://huggingface.co/alamios/Mistral-Small-3.1-DRAFT-0.5B) by [alamios](https://huggingface.co/alamios)
Based on: [Qwen2.5-0.5B](https://huggingface.co/Qwen/Qwen2.5-0.5B) by [Qwen](https://huggingface.co/Qwen)
## Quants
[4bpw h6 (main)](https://huggingface.co/cgus/Mistral-Small-3.1-DRAFT-0.5B-exl2/tree/main)
[5bpw h6](https://huggingface.co/cgus/Mistral-Small-3.1-DRAFT-0.5B-exl2/tree/5bpw-h6)
[6bpw h6](https://huggingface.co/cgus/Mistral-Small-3.1-DRAFT-0.5B-exl2/tree/6bpw-h6)
[8bpw h8](https://huggingface.co/cgus/Mistral-Small-3.1-DRAFT-0.5B-exl2/tree/8bpw-h8)
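
Each quant lives on its own branch of this repository, so a specific one can be fetched by passing the branch name as the revision. The following is a minimal sketch, assuming the `huggingface_hub` package is installed; the `QUANT_BRANCHES` mapping and the `download_quant` helper are hypothetical names introduced here for illustration, not part of any library.

```python
# Hypothetical helper: map the quants listed above to their repository branches.
REPO_ID = "cgus/Mistral-Small-3.1-DRAFT-0.5B-exl2"

QUANT_BRANCHES = {
    "4bpw-h6": "main",      # the 4bpw quant is stored on the main branch
    "5bpw-h6": "5bpw-h6",
    "6bpw-h6": "6bpw-h6",
    "8bpw-h8": "8bpw-h8",
}


def download_quant(quant: str, local_dir: str) -> str:
    """Download one quant branch into local_dir and return the local path."""
    # Imported lazily so the mapping above can be used without the package.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        revision=QUANT_BRANCHES[quant],
        local_dir=local_dir,
    )
```

For example, `download_quant("8bpw-h8", "models/Mistral-Small-3.1-DRAFT-0.5B-exl2-8bpw")` would place the 8bpw files in that folder, where the target directory name is only an assumption about how your model folder is organized.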
## Quantization notes
Made with ExLlamaV2 using the default calibration dataset.
These quants are meant to be used as a draft model with TabbyAPI.
The 8bpw version with FP16 cache is probably the most reliable option for this purpose.
## Original model card
# Mistral-Small-3.1-DRAFT-0.5B
This model is meant to be used as a draft model for speculative decoding with [mistralai/Mistral-Small-3.1-24B-Instruct-2503](https://huggingface.co/mistralai/Mistral-Small-3.1-24B-Instruct-2503) or [mistralai/Mistral-Small-24B-Instruct-2501](https://huggingface.co/mistralai/Mistral-Small-24B-Instruct-2501).
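
To illustrate what a draft model does in speculative decoding, here is a toy sketch of the greedy variant: the small model proposes a few tokens, the large model checks them, and the first disagreement is replaced by the large model's own token. The `toy_target` and `toy_draft` functions are stand-ins invented for this example, not real models, and a real engine verifies all proposed tokens in one batched forward pass instead of a Python loop.

```python
from typing import Callable, List

NextToken = Callable[[List[int]], int]


def speculative_step(prefix: List[int], draft_next: NextToken,
                     target_next: NextToken, k: int = 4) -> List[int]:
    """Run one round of greedy speculative decoding and return the new tokens."""
    # 1. The draft model proposes k tokens autoregressively.
    proposal: List[int] = []
    for _ in range(k):
        proposal.append(draft_next(prefix + proposal))

    # 2. The target model checks each proposed token in order.
    accepted: List[int] = []
    for tok in proposal:
        expected = target_next(prefix + accepted)
        if tok != expected:
            # First mismatch: keep the target's token and discard the rest.
            accepted.append(expected)
            return accepted
        accepted.append(tok)

    # 3. Every proposal matched, so the target contributes one extra token.
    accepted.append(target_next(prefix + accepted))
    return accepted


def generate(prefix: List[int], draft_next: NextToken,
             target_next: NextToken, max_new: int = 10, k: int = 4) -> List[int]:
    """Generate max_new tokens after prefix using repeated speculative steps."""
    out = list(prefix)
    while len(out) - len(prefix) < max_new:
        out.extend(speculative_step(out, draft_next, target_next, k))
    return out[:len(prefix) + max_new]


def toy_target(ctx: List[int]) -> int:
    # Stand-in for the large model: a deterministic function of context length.
    return (len(ctx) * 3) % 7


def toy_draft(ctx: List[int]) -> int:
    # Stand-in for the draft model: agrees with the target except at some positions.
    return 0 if len(ctx) % 3 == 2 else toy_target(ctx)


def greedy_baseline(prefix: List[int], target_next: NextToken, max_new: int) -> List[int]:
    """Plain greedy decoding with the target model only, for comparison."""
    out = list(prefix)
    for _ in range(max_new):
        out.append(target_next(out))
    return out
```

In this greedy setting the output of `generate([], toy_draft, toy_target)` matches `greedy_baseline([], toy_target, 10)` token for token. The draft model only changes how many target calls are needed, so a draft that agrees with the target more often yields a larger speedup.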
# Data info
The data are Mistral's outputs and include all kinds of tasks from various datasets in English, French, German, Spanish, Italian and Portuguese. The model was trained for 2 epochs on 20k unique examples, for a total of 12 million tokens per epoch.
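
As a rough sanity check on these figures, the arithmetic below derives the average example length and the total number of tokens seen during training. It treats "20k" and "12 million" as exact values, which is an assumption, since the card gives rounded numbers.

```python
# Figures quoted above, taken as exact for the sake of the estimate.
examples = 20_000
tokens_per_epoch = 12_000_000
epochs = 2

# Average length of one training example, in tokens.
avg_tokens_per_example = tokens_per_epoch / examples  # 600.0

# Total tokens processed over the whole training run.
total_tokens_seen = tokens_per_epoch * epochs  # 24_000_000

print(f"{avg_tokens_per_example:.0f} tokens per example on average")
print(f"{total_tokens_seen:,} tokens seen in total")
```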