### alpaca.cpp 65B ggml model weights
### How the 65B ggml weights were made
#### 1. clone the 65B model data
```shell
git clone https://huggingface.co/datasets/nyanko7/LLaMA-65B/
```
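Before converting, it is worth sanity-checking that the clone is complete. A minimal check, assuming the standard Meta layout of eight `consolidated.0N.pth` shards inside `LLaMA-65B/`:

```shell
# the 65B weights ship as 8 shards (assumed layout: consolidated.00.pth .. .07.pth)
for i in 0 1 2 3 4 5 6 7; do
  f="LLaMA-65B/consolidated.0$i.pth"
  [ -f "$f" ] && echo "$f ok" || echo "$f MISSING"
done
```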
#### 2. clone alpaca.cpp
```shell
git clone https://github.com/antimatter15/alpaca.cpp
```
#### 3. convert and quantize the weights (quantize.sh)
```shell
# the converter looks for tokenizer.model next to the model directory
mv LLaMA-65B/tokenizer.model ./
cd alpaca.cpp
make    # build the chat and quantize binaries if not already built
# convert the .pth shards to ggml (the trailing 1 selects f16 output)
python convert-pth-to-ggml.py ../LLaMA-65B/ 1
mkdir -p models/65B
# the 65B conversion is written as ggml-model-f16.bin plus ggml-model-f16.bin.1, .2, ...
mv ../LLaMA-65B/ggml-model-f16.bin* models/65B/
bash quantize.sh 65B
```
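As a rough guide to the disk space involved (my estimate, assuming ~65.2B parameters and q4_0 costing ~4.5 bits per weight: 4-bit values plus one f16 scale per 32-weight block):

```shell
# back-of-envelope size estimate for the f16 and q4_0 files
awk 'BEGIN {
  p = 65.2e9                                      # approx. parameter count (assumption)
  printf "f16:  ~%.0f GB\n", p * 2 / 1e9          # 2 bytes per weight
  printf "q4_0: ~%.0f GB\n", p * 4.5 / 8 / 1e9    # ~4.5 bits per weight
}'
```

So expect roughly 130 GB for the f16 intermediate and under 40 GB after quantization.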
#### 4. upload the weight files
Uploading directly was very slow (it would have taken almost two days), so I worked around it:
- used https://tmp.link/ as a temporary store
- then uploaded to Hugging Face from Colab via the Hub API
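One practical trick for moving a file this large in pieces is `split`/`cat`, which round-trips byte-for-byte (a small demo on a dummy file; for the real weights a chunk size like `-b 5G` would make sense):

```shell
head -c 1000000 /dev/urandom > model.bin      # stand-in for the big weight file
split -b 300000 model.bin model.bin.part.     # -> 4 chunks: .aa .ab .ac .ad
cat model.bin.part.* > model.rejoined.bin     # the shell glob sorts parts in order
cmp model.bin model.rejoined.bin && echo "round-trip OK"
```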
### run
```shell
git clone https://github.com/antimatter15/alpaca.cpp
cd alpaca.cpp
make chat
./chat -m alpaca.cpp_65b_ggml/ggml-model-q4_0.bin
```