---
license: apache-2.0
language:
- en
- fr
- de
- es
- it
- pt
base_model:
- alamios/Mistral-Small-3.1-DRAFT-0.5B
datasets:
- alamios/Mistral-Small-24B-Instruct-2501-Conversations
pipeline_tag: text-generation
library_name: exllamav2
tags:
- qwen
- qwen2.5
- mistral
- mistral-small
- mistral-small-3.1
---
# Mistral-Small-3.1-DRAFT-0.5B-exl2

Original model: [Mistral-Small-3.1-DRAFT-0.5B](https://huggingface.co/alamios/Mistral-Small-3.1-DRAFT-0.5B) by [alamios](https://huggingface.co/alamios)

Based on: [Qwen2.5-0.5B](https://huggingface.co/Qwen/Qwen2.5-0.5B) by [Qwen](https://huggingface.co/Qwen/Qwen2.5-0.5B)

## Quants

- [4bpw h6 (main)](https://huggingface.co/cgus/Mistral-Small-3.1-DRAFT-0.5B-exl2/tree/main)
- [5bpw h6](https://huggingface.co/cgus/Mistral-Small-3.1-DRAFT-0.5B-exl2/tree/5bpw-h6)
- [6bpw h6](https://huggingface.co/cgus/Mistral-Small-3.1-DRAFT-0.5B-exl2/tree/6bpw-h6)
- [8bpw h8](https://huggingface.co/cgus/Mistral-Small-3.1-DRAFT-0.5B-exl2/tree/8bpw-h8)

## Quantization notes

Made with ExLlamaV2 using the default calibration dataset.
These quants are meant to be used as a draft model with TabbyAPI.
The 8bpw version with an FP16 cache is probably the most reliable option for this purpose.

## Original model card

# Mistral-Small-3.1-DRAFT-0.5B

This model is meant to be used as a draft model for speculative decoding with [mistralai/Mistral-Small-3.1-24B-Instruct-2503](https://huggingface.co/mistralai/Mistral-Small-3.1-24B-Instruct-2503) or [mistralai/Mistral-Small-24B-Instruct-2501](https://huggingface.co/mistralai/Mistral-Small-24B-Instruct-2501).

# Data info

The data consists of Mistral's outputs and includes all kinds of tasks from various datasets in English, French, German, Spanish, Italian and Portuguese. The model was trained for 2 epochs on 20k unique examples, for a total of 12 million tokens per epoch.
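
The training figures quoted in the data description imply a rough token budget, which the short sketch below works out. It is only illustrative arithmetic: the constants are the rounded numbers from the text ("20k" and "12 million"), so the results are approximate, and the function name `training_token_budget` is a hypothetical helper, not part of any training script.

```python
# Rounded figures taken from the data description (approximate by nature).
EPOCHS = 2
UNIQUE_EXAMPLES = 20_000
TOKENS_PER_EPOCH = 12_000_000


def training_token_budget(epochs, examples, tokens_per_epoch):
    """Return (total tokens seen in training, average tokens per example)."""
    total_tokens = epochs * tokens_per_epoch
    avg_tokens_per_example = tokens_per_epoch / examples
    return total_tokens, avg_tokens_per_example


total, avg = training_token_budget(EPOCHS, UNIQUE_EXAMPLES, TOKENS_PER_EPOCH)
print(f"total tokens seen: {total:,}")            # 24,000,000
print(f"average tokens per example: {avg:.0f}")   # 600
```

In other words, the model saw roughly 24 million tokens in total, with an average conversation length of about 600 tokens, assuming the quoted figures are exact.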