Commit 73a9be5 (verified) · shimmyshimmer · parent: 522969f

Update README.md

Files changed (1): README.md (+5 −2)
README.md CHANGED
```diff
@@ -27,13 +27,16 @@ library_name: transformers
 </div>
 <h1 style="margin-top: 0rem;">🌙 Kimi K2 Usage Guidelines</h1>
 </div>
+
+- To run, you must use llama.cpp [PR #14654](https://github.com/ggml-org/llama.cpp/pull/14654) or [our llama.cpp fork](https://github.com/unslothai/llama.cpp) (easier)
+- For complete detailed instructions, see our guide: [docs.unsloth.ai/basics/kimi-k2](https://docs.unsloth.ai/basics/kimi-k2)
+
 It is recommended to have at least 128GB of unified memory to run the small quants. With 16GB VRAM and 256GB RAM, expect 5+ tokens/sec.
 For best results, use any 2-bit XL quant or above.
 
 Set the temperature to 0.6 (recommended) to reduce repetition and incoherence.
 
-- Use llama.cpp's [PR #14654](https://github.com/ggml-org/llama.cpp/pull/14654) or [our llama.cpp fork](https://github.com/unslothai/llama.cpp) (easier to work)
-- For complete detailed instructions, see our guide: [docs.unsloth.ai/basics/kimi-k2](https://docs.unsloth.ai/basics/kimi-k2)
+---
 
 <div align="center">
 <picture>
```
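The updated guidelines above can be sketched as a shell recipe. This is a minimal sketch, assuming the Unsloth fork builds like upstream llama.cpp; the model filename and the `-ngl` value are illustrative placeholders, not taken from the diff.

```shell
# A minimal sketch, assuming the Unsloth llama.cpp fork builds like upstream
# and that a 2-bit XL GGUF quant has already been downloaded.
git clone https://github.com/unslothai/llama.cpp
cmake llama.cpp -B llama.cpp/build -DGGML_CUDA=ON
cmake --build llama.cpp/build --config Release -j

# --temp 0.6 follows the guideline above; -ngl offloads layers to the GPU.
# The .gguf filename below is hypothetical.
./llama.cpp/build/bin/llama-cli \
  -m Kimi-K2-Instruct-UD-Q2_K_XL.gguf \
  --temp 0.6 \
  -ngl 99
```

Whether you use the PR branch or the fork, the invocation is the same; only the build source differs.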