jeffcookio committed on
Commit 9688af9 · verified · 1 Parent(s): b37d7c4

Update README.md

Files changed (1)
  1. README.md +2 -0
README.md CHANGED
@@ -5,4 +5,6 @@ base_model:
 
 This is Mistral-Small-3.1-24B-Instruct-2503 quantized with a hacked-up GPTQModel build that has preliminary `Mistral3ForConditionalGeneration` support; there were several weird changes along the way. Calibration was run against the `flickr30k` dataset (with too few samples; I may upload a version with more thorough calibration soon), so this should be a true vision-aware quant of the Mistral Small 3.1 HF checkpoint.
 
+ You need this branch of vLLM to run it: https://github.com/sjuxax/vllm/tree/Mistral3.1
+
 Another "feature" of this version is that it was quantized with a preliminary implementation of block-diagonal Hessians (authored entirely by Grok3), which let me run the quantization without going OOM on my 24 GB of VRAM.
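
Below is a minimal sketch of what a quantization run like the one described above might look like with the stock GPTQModel API (`GPTQModel.load` / `quantize` / `save`). The dataset id, caption field, sample count, and quantization settings are my own illustrative assumptions; the actual run used a hacked-up GPTQModel fork whose vision-aware calibration path is not reproduced here.

```python
# Sketch only: stock GPTQModel workflow with flickr30k captions as calibration text.
# The vision-aware calibration described in the card lives in a hacked-up fork and
# is NOT shown here; dataset id, field names, and settings are assumptions.
from datasets import load_dataset
from gptqmodel import GPTQModel, QuantizeConfig

MODEL_ID = "mistralai/Mistral-Small-3.1-24B-Instruct-2503"
OUT_DIR = "Mistral-Small-3.1-24B-Instruct-2503-GPTQ"  # hypothetical output directory

# Assumed flickr30k mirror and caption field; adjust to whatever copy you use.
rows = load_dataset("nlphuji/flickr30k", split="test").select(range(256))
calibration = [
    " ".join(r["caption"]) if isinstance(r["caption"], list) else r["caption"]
    for r in rows
]

quant_config = QuantizeConfig(bits=4, group_size=128)  # typical GPTQ settings, not confirmed by the card
model = GPTQModel.load(MODEL_ID, quant_config)
model.quantize(calibration, batch_size=1)  # small batch to stay within limited VRAM
model.save(OUT_DIR)
```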
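
As for the vLLM requirement added in the `+` lines: a rough, unverified sketch of using that branch follows. It assumes vLLM was installed from the linked branch (for example via `pip install git+https://github.com/sjuxax/vllm.git@Mistral3.1`, or a source build if the branch needs one); the model path and settings are placeholders, not values from this card.

```python
# Sketch only: assumes vLLM installed from the Mistral3.1 branch linked above and
# a local or Hub copy of the quantized checkpoint (path below is a placeholder).
from vllm import LLM, SamplingParams

llm = LLM(
    model="path/to/Mistral-Small-3.1-24B-Instruct-2503-GPTQ",  # placeholder
    quantization="gptq",   # standard vLLM option for GPTQ checkpoints
    max_model_len=8192,    # trimmed so the KV cache fits alongside the weights on 24 GB
)

outputs = llm.generate(
    ["Describe what makes a photo of a busy street market interesting."],
    SamplingParams(temperature=0.7, max_tokens=128),
)
print(outputs[0].outputs[0].text)
```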
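
On the block-diagonal Hessian remark: GPTQ normally accumulates a full `d x d` Hessian (proportional to `X^T X` over calibration activations) for each linear layer, which is what overflows 24 GB at large hidden sizes. A block-diagonal approximation keeps only per-block sub-Hessians and drops cross-block terms, cutting storage from `O(d^2)` to `O(d * B)`. The sketch below is my own illustration of that idea, not the fork's actual implementation.

```python
# Conceptual illustration of a block-diagonal Hessian for GPTQ-style quantization.
# Instead of one dense (d x d) Hessian H = 2 * X^T X over calibration activations,
# keep independent (B x B) blocks along the diagonal and ignore cross-block terms.
# Illustration only; not the implementation used to produce this quant.
import torch

def block_diagonal_hessian(x: torch.Tensor, block_size: int) -> list[torch.Tensor]:
    """x: (n_samples, d) calibration activations feeding one linear layer."""
    d = x.shape[1]
    blocks = []
    for start in range(0, d, block_size):
        xb = x[:, start:start + block_size]  # columns belonging to this block
        blocks.append(2.0 * xb.T @ xb)       # (B x B) sub-Hessian
    return blocks

# Rough memory comparison for a 5120-wide layer:
d, B = 5120, 512
dense_entries = d * d             # ~26.2M entries for the full Hessian
block_entries = (d // B) * B * B  # ~2.6M entries for the block-diagonal version
print(f"dense: {dense_entries:,} entries, block-diagonal: {block_entries:,} entries")
```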