rohitg/Mixtral-8x22B-Instruct-v0.1-hf-4bit_g64-HQQ

rohitg committed on Apr 23, 2024

Commit 510ad92 · verified · 1 Parent(s): 928bfc4

Update README.md

Files changed (1)

README.md +2 -0

@@ -13,9 +13,11 @@ from hqq.utils.patching import prepare_for_inference
 ### Loading Weights
+```python
 model = HQQModelForCausalLM.from_quantized("rohitg/Mixtral-8x22B-Instruct-v0.1-hf-4bit_g64-HQQ", device='cuda')
 tokenizer = AutoTokenizer.from_pretrained('mistralai/Mixtral-8x22B-Instruct-v0.1')
 prepare_for_inference(model, backend="torchao_int4")
+```
 ### Text Generation