---
library_name: transformers
tags: []
---

This is a 4-bit (nf4) quantized version of the Jais-13b-chat model.

Loading this model requires the bitsandbytes quantization library (`pip install bitsandbytes`).

If you are using text-generation-webui, select the Transformers loader and apply the following settings (an equivalent `BitsAndBytesConfig` is sketched after this list):
- Compute dtype: bfloat16
- Quantization type: nf4
- Load in 4-bit: True
- Use double quantization: True
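
If you are loading the model in your own code instead, the same settings map onto the `BitsAndBytesConfig` class from `transformers`. A minimal sketch of that mapping (only needed if you want to set the options explicitly rather than rely on the quantization config saved with the model):

```python
import torch
from transformers import BitsAndBytesConfig

# Mirrors the text-generation-webui settings listed above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # Load in 4-bit: True
    bnb_4bit_quant_type="nf4",              # Quantization type: nf4
    bnb_4bit_compute_dtype=torch.bfloat16,  # Compute dtype: bfloat16
    bnb_4bit_use_double_quant=True,         # Use double quantization: True
)
```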



```python
from datetime import datetime
import warnings

from transformers import AutoModelForCausalLM, AutoTokenizer

warnings.filterwarnings('ignore')

model_name = "jwnder/core42_jais-13b-chat-bnb-4bit"

# Jais ships custom modeling code, so trust_remote_code is required
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)

inputs = tokenizer("Testing LLM!", return_tensors="pt")

# Time a single generation call
start = datetime.now()
outputs = model.generate(**inputs)
end = datetime.now()

print(f"Generation took {end - start}")
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
```
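
4-bit bitsandbytes models generally end up on a CUDA device, so if the model loads onto GPU, make sure the input tensors are on the same device before calling `generate`. A minimal adjustment, assuming the model was loaded with `device_map="auto"` (which requires `accelerate`):

```python
# Move the tokenized inputs to whatever device the model landed on
inputs = tokenizer("Testing LLM!", return_tensors="pt").to(model.device)
```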