|
This is the 4-bit quantized version of `inception-mbzuai/jais-13b-chat`, created using AutoTrain, but it does not work.
|
|
|
## Error |
|
### GPU |
|
|
|
 |
|
|
|
### CPU |
|
|
|
 |
|
|
|
|
|
## Quantization Process |
|
|
|
```py
!pip install auto-gptq
!pip install git+https://github.com/huggingface/optimum.git
!pip install git+https://github.com/huggingface/transformers.git
!pip install --upgrade accelerate
```
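
GPTQ support was brand new in `transformers` and `optimum` when this card was written, which is presumably why both are installed from their `git` repositories. Assuming current releases (GPTQ integration shipped in `transformers` 4.32 and `optimum` 1.12), pinned versions should work equally well:

```py
!pip install "transformers>=4.32.0" "optimum>=1.12.0" auto-gptq accelerate
```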
|
|
|
```py
from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

# Tokenizer of the base model, used by GPTQ to tokenize the calibration data
tokenizer = AutoTokenizer.from_pretrained("inception-mbzuai/jais-13b-chat")

# 4-bit GPTQ quantization, calibrated on the "c4" dataset
gptq_config = GPTQConfig(bits=4, dataset="c4", tokenizer=tokenizer)

# Passing the config quantizes the weights while loading (GPU required)
model = AutoModelForCausalLM.from_pretrained(
    "inception-mbzuai/jais-13b-chat",
    quantization_config=gptq_config,
    trust_remote_code=True,
)
```
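
For reference, the steps that would normally follow, saving the quantized weights and reloading them for inference, are sketched below. This sketch is not from the original card: the output directory and the prompt are hypothetical placeholders, and it follows the standard `transformers` GPTQ flow, which, per the errors above, does not succeed for this checkpoint.

```py
# Minimal sketch of the usual follow-up, assuming the quantization above
# completed. "jais-13b-chat-gptq-4bit" and the prompt are hypothetical.
from transformers import AutoModelForCausalLM, AutoTokenizer

out_dir = "jais-13b-chat-gptq-4bit"
model.save_pretrained(out_dir)     # persist quantized weights + config
tokenizer.save_pretrained(out_dir)

# Reload the quantized checkpoint; GPTQ inference needs a CUDA GPU
model = AutoModelForCausalLM.from_pretrained(
    out_dir,
    device_map="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(out_dir)

prompt = "What is the capital of the UAE?"  # placeholder prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```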
|
|
|
|