CRD716 committed on
Commit a9dfd2e · 1 Parent(s): 6c9f3f9

Update README.md

  1. README.md +3 -4
README.md CHANGED
@@ -28,14 +28,13 @@ language:
   - sr
   - sv
   - uk
- library_name: adapter-transformers
  ---
 
  LLaMa 65B converted to ggml via LLaMa.cpp, then quantized to 4bit.
 
- Note: If you previously used the q4_0 model before April 26th, 2023, you are using an outdated model. I suggest redownloading for a better experience.
- Check https://github.com/ggerganov/llama.cpp#quantization for details on the different quantization types.
+ Legacy is for llama.cpp setups older than https://github.com/ggerganov/llama.cpp/pull/1405, the regular is faster but does not work on old versions.
 
- I recommend the following settings when running as a good starting point: ```main.exe -m ggml-LLaMa-65B-q4_0.bin -n -1 -t 42 -c 2048 --temp 0.4 --interactive-first --repeat_penalty 1.2 --color```
+ I recommend the following settings when running as a good starting point:
+ ```main.exe -m ggml-LLaMa-65B-q4_0.bin -n -1 -t 32 -c 2048 --temp 0.7 --repeat_penalty 1.2 --mirostat 2 --interactive-first --color```
 
  Be aware that LLaMa is a text generation model, not a conversational one, and as such you will have to prompt it differently than, for example, Vicuna or ChatGPT.
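
The README's closing note about prompting can be illustrated with a small sketch. A base model such as LLaMa continues text rather than following instructions, so one common approach is to frame the request as the start of a document the model would plausibly complete. The helper name `build_completion_prompt`, the header sentence, and the Q/A transcript layout are all assumptions made for illustration; none of them come from this repository or from llama.cpp.

```python
def build_completion_prompt(examples, question):
    """Frame a question as an unfinished Q/A transcript for a base model.

    `examples` is a list of (question, answer) pairs used as few-shot context.
    The layout is a hypothetical convention, not something the model requires.
    """
    lines = ["The following is a list of questions with accurate answers.", ""]
    for q, a in examples:
        lines.append(f"Q: {q}")
        lines.append(f"A: {a}")
        lines.append("")
    # End on an open "A:" so the most likely continuation is the answer.
    lines.append(f"Q: {question}")
    lines.append("A:")
    return "\n".join(lines)


if __name__ == "__main__":
    prompt = build_completion_prompt(
        [("What is the capital of France?", "Paris")],
        "What is the capital of Japan?",
    )
    print(prompt)
```

The resulting text could then be supplied as the initial prompt, for example by saving it to a file and passing that file with llama.cpp's `-f` option alongside the settings recommended above. Whether this framing works well for a given task is something to verify by experiment.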