Update README.md
README.md
CHANGED
@@ -4,6 +4,16 @@ metrics: null
 
 Quantized Meta AI's [LLaMA](https://arxiv.org/abs/2302.13971) in 4bit with the help of [GPTQ](https://arxiv.org/abs/2210.17323v2) algorithm v2.
 
+- [**llama13b-4bit-ts-ao-g128-v2.safetensors**](https://huggingface.co/sardukar/llama13b-4bit-v2/blob/main/llama13b-4bit-ts-ao-g128-v2.safetensors)
+GPTQ implementation - https://github.com/qwopqwop200/GPTQ-for-LLaMa/tree/49efe0b67db4b40eac2ae963819ebc055da64074
+
+Conversion process:
+```sh
+CUDA_VISIBLE_DEVICES=0 python llama.py ./llama-13b c4 --wbits 4 --true-sequential --act-order --groupsize 128 --save_safetensors ./q4/llama13b-4bit-ts-ao-g128-v2.safetensors
+```
+
+
+- [llama13b-4bit-v2.safetensors](https://huggingface.co/sardukar/llama13b-4bit-v2/blob/main/llama13b-4bit-v2.safetensors)
 GPTQ implementation - https://github.com/qwopqwop200/GPTQ-for-LLaMa/tree/841feedde876785bc8022ca48fd9c3ff626587e2
 
 **Note:** This model will fail to load with current GPTQ-for-LLaMa implementation
@@ -11,4 +21,4 @@ GPTQ implementation - https://github.com/qwopqwop200/GPTQ-for-LLaMa/tree/841feed
 Conversion process
 ```sh
 CUDA_VISIBLE_DEVICES=0 python llama.py ./llama-13b c4 --wbits 4 --true-sequential --act-order --save_safetensors ./q4/llama13b-4bit-v2.safetensors
-```
+```
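
The two conversion commands in this change differ only in the `--groupsize 128` flag and the output file name. As a rough sketch, the pair could be generated from one small dry-run helper that prints the command rather than running it. The function names `build_name` and `build_command` are hypothetical and are not part of GPTQ-for-LLaMa. The file names are copied from this README.

```sh
#!/bin/sh
# Hypothetical helpers (not part of GPTQ-for-LLaMa). They only print text.

# Pick the output file name used in this README for a given group size.
# An empty argument means the checkpoint quantized without --groupsize.
build_name() {
    groupsize="$1"
    if [ -n "$groupsize" ]; then
        printf 'llama13b-4bit-ts-ao-g%s-v2.safetensors\n' "$groupsize"
    else
        printf 'llama13b-4bit-v2.safetensors\n'
    fi
}

# Print the conversion command for a given group size (dry run, nothing is executed).
build_command() {
    groupsize="$1"
    extra=""
    if [ -n "$groupsize" ]; then
        extra=" --groupsize $groupsize"
    fi
    printf 'CUDA_VISIBLE_DEVICES=0 python llama.py ./llama-13b c4 --wbits 4 --true-sequential --act-order%s --save_safetensors ./q4/%s\n' \
        "$extra" "$(build_name "$groupsize")"
}

build_command 128   # command for the group-size-128 checkpoint
build_command ""    # command for the original checkpoint
```

Printing the command first makes it easy to compare against the README before starting a long quantization run. Each command must still be run against the GPTQ-for-LLaMa commit listed next to its checkpoint.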