	---
	license: apache-2.0
	language:
	- en
	- fr
	- de
	- es
	- it
	- pt
	base_model:
	- alamios/Mistral-Small-3.1-DRAFT-0.5B
	datasets:
	- alamios/Mistral-Small-24B-Instruct-2501-Conversations
	pipeline_tag: text-generation
	library_name: exllamav2
	tags:
	- qwen
	- qwen2.5
	- mistral
	- mistral-small
	- mistral-small-3.1
	---
	# Mistral-Small-3.1-DRAFT-0.5B-exl2
	Original model: [Mistral-Small-3.1-DRAFT-0.5B](https://huggingface.co/alamios/Mistral-Small-3.1-DRAFT-0.5B) by [alamios](https://huggingface.co/alamios)
	Based on: [Qwen2.5-0.5B](https://huggingface.co/Qwen/Qwen2.5-0.5B) by [Qwen](https://huggingface.co/Qwen)

	## Quants
	[4bpw h6 (main)](https://huggingface.co/cgus/Mistral-Small-3.1-DRAFT-0.5B-exl2/tree/main)
	[5bpw h6](https://huggingface.co/cgus/Mistral-Small-3.1-DRAFT-0.5B-exl2/tree/5bpw-h6)
	[6bpw h6](https://huggingface.co/cgus/Mistral-Small-3.1-DRAFT-0.5B-exl2/tree/6bpw-h6)
	[8bpw h8](https://huggingface.co/cgus/Mistral-Small-3.1-DRAFT-0.5B-exl2/tree/8bpw-h8)
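
Each quant lives on its own branch of this repository, so a specific one can be fetched by passing the branch name as the revision. The sketch below assumes the `huggingface_hub` package is installed; the helper names and the local folder layout are illustrative, not part of this repository.

```python
REPO_ID = "cgus/Mistral-Small-3.1-DRAFT-0.5B-exl2"

# Branch names taken from the quant links above (4bpw h6 is on "main").
QUANT_BRANCHES = {
    "4bpw-h6": "main",
    "5bpw-h6": "5bpw-h6",
    "6bpw-h6": "6bpw-h6",
    "8bpw-h8": "8bpw-h8",
}


def resolve_revision(quant: str) -> str:
    """Map a quant label to the branch that holds it."""
    if quant not in QUANT_BRANCHES:
        raise ValueError(f"unknown quant {quant!r}, expected one of {sorted(QUANT_BRANCHES)}")
    return QUANT_BRANCHES[quant]


def download_quant(quant: str, models_dir: str = "models") -> str:
    """Download one quant into models_dir and return the local path.

    The target folder name is a hypothetical convention chosen for this example.
    """
    # Imported lazily so the helpers above work without the package installed.
    from huggingface_hub import snapshot_download

    local_dir = f"{models_dir}/Mistral-Small-3.1-DRAFT-0.5B-exl2-{quant}"
    return snapshot_download(
        repo_id=REPO_ID,
        revision=resolve_revision(quant),
        local_dir=local_dir,
    )
```

Calling `download_quant("8bpw-h8")` would place the 8bpw files under `models/`, which is a convenient layout if the same folder is later used as a model directory.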

	## Quantization notes
	Quantized with ExLlamaV2 using its default calibration dataset.
	These quants are meant to be used as a draft model in TabbyAPI.
	The 8bpw version with FP16 cache is probably the most reliable option for this purpose.
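
As a rough illustration of the recommendation above, the snippet below renders a `draft_model` section for a TabbyAPI `config.yml`. The key names (`draft_model_dir`, `draft_model_name`, `draft_cache_mode`) are assumed from TabbyAPI's sample config and should be checked against the version you run. The folder name is hypothetical.

```python
def draft_model_section(
    model_name: str,
    cache_mode: str = "FP16",
    model_dir: str = "models",
) -> str:
    """Render a draft_model block as YAML text.

    Key names are an assumption based on TabbyAPI's sample config;
    verify them against your installed version before use.
    """
    lines = [
        "draft_model:",
        f"  draft_model_dir: {model_dir}",
        f"  draft_model_name: {model_name}",
        f"  draft_cache_mode: {cache_mode}",
    ]
    return "\n".join(lines)


# Hypothetical folder name holding the 8bpw quant of this draft model.
print(draft_model_section("Mistral-Small-3.1-DRAFT-0.5B-exl2-8bpw-h8"))
```

The printed block is meant to be pasted into the config next to the main `model` section that points at the 24B target model.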

	## Original model card
	# Mistral-Small-3.1-DRAFT-0.5B

	This model is meant to be used as a draft model for speculative decoding with [mistralai/Mistral-Small-3.1-24B-Instruct-2503](https://huggingface.co/mistralai/Mistral-Small-3.1-24B-Instruct-2503) or [mistralai/Mistral-Small-24B-Instruct-2501](https://huggingface.co/mistralai/Mistral-Small-24B-Instruct-2501).
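
To show what a draft model does in speculative decoding, here is a toy sketch of the greedy variant: the small model proposes a few tokens, the large model checks them, and the matching prefix is kept. The two "models" are stand-in functions over integer tokens, not real networks, and this is not how any particular inference engine implements it. Real engines verify the whole proposal in one batched forward pass and may use probabilistic acceptance when sampling.

```python
from typing import Callable, List

# A "model" here is just a function returning the greedy next token.
NextToken = Callable[[List[int]], int]


def greedy_decode(model: NextToken, prompt: List[int], max_new: int) -> List[int]:
    """Plain greedy decoding, one model call per generated token."""
    tokens = list(prompt)
    for _ in range(max_new):
        tokens.append(model(tokens))
    return tokens


def speculative_decode(
    target: NextToken, draft: NextToken, prompt: List[int], max_new: int, k: int = 4
) -> List[int]:
    """Greedy speculative decoding: output matches greedy_decode(target, ...)."""
    tokens = list(prompt)
    limit = len(prompt) + max_new
    while len(tokens) < limit:
        # 1. The draft model proposes up to k tokens autoregressively.
        n = min(k, limit - len(tokens))
        proposal: List[int] = []
        for _ in range(n):
            proposal.append(draft(tokens + proposal))

        # 2. The target checks every proposed position.
        base = len(tokens)
        context = tokens + proposal  # separate list, unaffected by appends below
        for i in range(n):
            expected = target(context[: base + i])
            tokens.append(expected)  # equals proposal[i] when the draft was right
            if expected != proposal[i]:
                break  # first mismatch: keep the target's token, drop the rest
    return tokens


def toy_target(tokens: List[int]) -> int:
    """Stand-in for the large model (assumes a non-empty prompt)."""
    return (tokens[-1] * 3 + len(tokens)) % 11


def toy_draft(tokens: List[int]) -> int:
    """Stand-in for the draft model: agrees with the target most of the time."""
    guess = toy_target(tokens)
    return guess if len(tokens) % 4 else (guess + 1) % 11
```

Because every accepted token is one the target would have produced anyway, the output is identical to decoding with the target alone. The speed-up in practice comes from how often the draft guesses right, which is why the draft is trained on the target's own outputs.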

	# Data info

	The data consists of Mistral's outputs and includes all kinds of tasks from various datasets in English, French, German, Spanish, Italian and Portuguese. The model was trained for 2 epochs on 20k unique examples, for a total of 12 million tokens per epoch.
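
The figures above imply an average example length and a total token count, worked out below. The inputs are the rounded numbers from the text ("20k", "12 million"), so the results are approximate.

```python
unique_examples = 20_000        # "20k unique examples"
tokens_per_epoch = 12_000_000   # "12 million tokens per epoch"
epochs = 2

# Average length of one training example, in tokens.
avg_tokens_per_example = tokens_per_epoch / unique_examples

# Total tokens processed over the whole run.
total_tokens_seen = tokens_per_epoch * epochs

print(f"~{avg_tokens_per_example:.0f} tokens per example")
print(f"~{total_tokens_seen / 1e6:.0f}M tokens seen in training")
```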