---
license: apache-2.0
language:
- en
- fr
- de
- es
- it
- pt
base_model:
- alamios/Mistral-Small-3.1-DRAFT-0.5B
datasets:
- alamios/Mistral-Small-24B-Instruct-2501-Conversations
pipeline_tag: text-generation
library_name: exllamav2
tags:
- qwen
- qwen2.5
- mistral
- mistral-small
- mistral-small-3.1
---

# Mistral-Small-3.1-DRAFT-0.5B-exl2

Original model: [Mistral-Small-3.1-DRAFT-0.5B](https://huggingface.co/alamios/Mistral-Small-3.1-DRAFT-0.5B) by [alamios](https://huggingface.co/alamios)

Based on: [Qwen2.5-0.5B](https://huggingface.co/Qwen/Qwen2.5-0.5B) by [Qwen](https://huggingface.co/Qwen/Qwen2.5-0.5B)

## Quants

[4bpw h6 (main)](https://huggingface.co/cgus/Mistral-Small-3.1-DRAFT-0.5B-exl2/tree/main)

[5bpw h6](https://huggingface.co/cgus/Mistral-Small-3.1-DRAFT-0.5B-exl2/tree/5bpw-h6)

[6bpw h6](https://huggingface.co/cgus/Mistral-Small-3.1-DRAFT-0.5B-exl2/tree/6bpw-h6)

[8bpw h8](https://huggingface.co/cgus/Mistral-Small-3.1-DRAFT-0.5B-exl2/tree/8bpw-h8)

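
Each quant lives on its own branch of the repository, so fetching one comes down to choosing the right revision. The following is a minimal sketch, assuming the `huggingface_hub` package is installed. The helper names `quant_revision` and `download_quant` are hypothetical and exist only for illustration; the branch names are taken from the links above.

```python
# Repository and branch names as listed in the Quants section.
REPO_ID = "cgus/Mistral-Small-3.1-DRAFT-0.5B-exl2"

QUANT_BRANCHES = {
    4: "main",     # 4bpw h6
    5: "5bpw-h6",  # 5bpw h6
    6: "6bpw-h6",  # 6bpw h6
    8: "8bpw-h8",  # 8bpw h8
}


def quant_revision(bpw: int) -> str:
    """Return the branch that holds the quant with the given bits per weight."""
    if bpw not in QUANT_BRANCHES:
        raise ValueError(
            f"no {bpw}bpw quant available; choose one of {sorted(QUANT_BRANCHES)}"
        )
    return QUANT_BRANCHES[bpw]


def download_quant(bpw: int, local_dir: str) -> str:
    """Download one quant into local_dir and return the local path."""
    # Imported lazily so quant_revision() works without huggingface_hub.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        revision=quant_revision(bpw),
        local_dir=local_dir,
    )
```

For example, `download_quant(8, "models/Mistral-Small-3.1-DRAFT-0.5B-exl2-8bpw")` would place the 8bpw files in that folder. The target directory is an assumption and should match wherever your inference server looks for models.
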
## Quantization notes

Made with ExLlamaV2 using the default calibration dataset.

These quants are meant to be used as a draft model with TabbyAPI.

The 8bpw version with FP16 cache is probably the most reliable option for this purpose.

## Original model card

# Mistral-Small-3.1-DRAFT-0.5B

This model is meant to be used as a draft model for speculative decoding with [mistralai/Mistral-Small-3.1-24B-Instruct-2503](https://huggingface.co/mistralai/Mistral-Small-3.1-24B-Instruct-2503) or [mistralai/Mistral-Small-24B-Instruct-2501](https://huggingface.co/mistralai/Mistral-Small-24B-Instruct-2501).

# Data info

The data are Mistral's outputs and include all kinds of tasks from various datasets in English, French, German, Spanish, Italian and Portuguese. The model was trained for 2 epochs on 20k unique examples, for a total of 12 million tokens per epoch.
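
To put the training figures in perspective, here is a rough back-of-the-envelope calculation. It assumes the stated numbers (20k examples, 12 million tokens per epoch) are rounded, so the results are approximate and are not figures reported by the original author.

```python
# Figures as stated in the data description (assumed to be rounded).
unique_examples = 20_000
tokens_per_epoch = 12_000_000
epochs = 2

# Average length of one training example, in tokens.
avg_tokens_per_example = tokens_per_epoch / unique_examples

# Total tokens processed over the whole training run.
total_tokens_seen = tokens_per_epoch * epochs

print(f"average tokens per example: {avg_tokens_per_example:.0f}")
print(f"total tokens seen in training: {total_tokens_seen:,}")
```

Under these assumptions the examples average roughly 600 tokens each, and the model sees about 24 million tokens across both epochs.
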