Commit 5157a50 by justinthelaw
1 Parent(s): 74a622c

Update README.md

Files changed (1)
  1. README.md +5 -3
README.md CHANGED
@@ -39,9 +39,11 @@ This repo contains GPTQ 4-bit, 32g Group Size, quantized model files from the No
 
  Models are released as sharded safetensors files.
 
- | Bits | GS | GPTQ Dataset | Seq Len | Size |
- | ---- | -- | ----------- | ------- | ---- |
- | 4 | 32 | [VMWare Open Instruct](https://huggingface.co/datasets/vmware/open-instruct) | 1,024 | 4.57 GB
+ | Bits | GS | GPTQ Dataset | Max Seq Len | Size | VRAM |
+ | ---- | -- | ----------- | ------- | ---- | ---- |
+ | 4 | 32 | [VMWare Open Instruct](https://huggingface.co/datasets/vmware/open-instruct) | 32,768 | 4.57 GB | 19-23 Gb*
+
+ * Depends on maximum sequence length parameter (KV cache utilization) used with vLLM or Transformers
 
  <!-- README_GPTQ.md-provided-files end -->
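
The added footnote ties VRAM use to the maximum sequence length through the KV cache. As a rough illustration of why that dependence exists, here is a minimal sketch of the standard KV cache size estimate (one key and one value vector per layer, per KV head, per token). The function name `kv_cache_bytes` and the model dimensions in `HYPOTHETICAL_DIMS` are assumptions made for this example and are not taken from this repo; the 19-23 range in the README is not derived from this calculation.

```python
def kv_cache_bytes(seq_len, n_layers, n_kv_heads, head_dim,
                   bytes_per_elem=2, batch_size=1):
    """Estimate KV cache size in bytes for a decoder-only transformer.

    The leading factor of 2 counts the key tensor and the value tensor.
    bytes_per_elem=2 assumes the cache is stored in fp16/bf16.
    """
    return (2 * n_layers * n_kv_heads * head_dim
            * seq_len * bytes_per_elem * batch_size)


# Hypothetical dimensions, chosen only for illustration (not read from this model).
HYPOTHETICAL_DIMS = dict(n_layers=32, n_kv_heads=8, head_dim=128)

if __name__ == "__main__":
    # Compare the old and new sequence lengths from the table in the diff.
    for seq_len in (1024, 32768):
        gib = kv_cache_bytes(seq_len, **HYPOTHETICAL_DIMS) / 2**30
        print(f"seq_len={seq_len}: ~{gib:.3f} GiB of KV cache per sequence")
```

The estimate grows linearly with sequence length, so a lower maximum sequence length leaves less memory reserved for the cache. Actual usage also includes the quantized weights, activations, and any memory the serving framework preallocates, which this sketch does not model.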