Update README.md
README.md
CHANGED
@@ -1,10 +1,7 @@
 ---
 base_model:
--
-
-- TheDrummer/Rocinante-12B-v1.1
-- MarinaraSpaghetti/NemoMix-Unleashed-12B
-library_name: transformers
 tags:
 - mergekit
 - merge
@@ -15,6 +12,29 @@ language:
 - ru
 - en
 ---
 # NekoMix-12B

|
|
|
 ---
 base_model:
+- Moraliane/NekoMix-12B
+library_name: exllamav2
 tags:
 - mergekit
 - merge
|
|
|
 - ru
 - en
 ---
+# NekoMix-12B-exl2
+Original model: [NekoMix-12B](https://huggingface.co/Moraliane/NekoMix-12B) by [Moraliane](https://huggingface.co/Moraliane)
+
+## Quants
+[4bpw h6 (main)](https://huggingface.co/cgus/NekoMix-12B-exl2/tree/main)
+[4.5bpw h6](https://huggingface.co/cgus/NekoMix-12B-exl2/tree/4.5bpw-h6)
+[5bpw h6](https://huggingface.co/cgus/NekoMix-12B-exl2/tree/5bpw-h6)
+[6bpw h6](https://huggingface.co/cgus/NekoMix-12B-exl2/tree/6bpw-h6)
+[8bpw h8](https://huggingface.co/cgus/NekoMix-12B-exl2/tree/8bpw-h8)
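Each quant above lives on its own branch of the `cgus/NekoMix-12B-exl2` repository, so a single quant can be fetched by passing the branch name as the revision. The following is a minimal sketch, assuming the `huggingface_hub` package is installed; the helper names `branch_for` and `download_quant` and the quant labels used as dictionary keys are hypothetical and are not part of the original README.

```python
# Sketch only: helper names and quant labels are illustrative assumptions.
REPO_ID = "cgus/NekoMix-12B-exl2"

# Quant label -> repository branch, taken from the links in the Quants list.
QUANT_BRANCHES = {
    "4bpw-h6": "main",
    "4.5bpw-h6": "4.5bpw-h6",
    "5bpw-h6": "5bpw-h6",
    "6bpw-h6": "6bpw-h6",
    "8bpw-h8": "8bpw-h8",
}


def branch_for(quant: str) -> str:
    """Return the branch that holds the given quant, or raise ValueError."""
    if quant not in QUANT_BRANCHES:
        known = ", ".join(sorted(QUANT_BRANCHES))
        raise ValueError(f"unknown quant {quant!r}; expected one of: {known}")
    return QUANT_BRANCHES[quant]


def download_quant(quant: str, local_dir: str) -> str:
    """Download one quant branch into local_dir and return the local path."""
    # Imported lazily so the mapping above can be used without the package.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        revision=branch_for(quant),
        local_dir=local_dir,
    )


# Example (performs a network download, so it is left commented out):
# download_quant("6bpw-h6", "./NekoMix-12B-exl2-6bpw")
```

The resulting directory can then be pointed at from whichever loader is used, for example as a model folder for TabbyAPI or Text-Generation-WebUI.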
+
+## Quantization notes
+Made with ExLlamaV2 0.2.8 using the default calibration dataset.
+It appears to be primarily a Russian roleplay (RP) model. I have not evaluated how well it performs.
+It can be used with TabbyAPI or Text-Generation-WebUI on an Nvidia RTX GPU under Windows, or on an RTX or ROCm-capable GPU under Linux.
+ExLlamaV2 doesn't support offloading to system RAM, so make sure the model fits entirely in your GPU's VRAM. Otherwise, use GGUF quants instead.
+For example, with 12GB of VRAM it can be run at 6bpw with a Q6 cache and 16k context.
+
+This model can be used with TabbyAPI or Text-Generation-WebUI.
+It requires an Nvidia RTX GPU (Windows) or an RTX/ROCm GPU (Linux).
+The exl2 format requires the model to fit entirely in video memory.
+For example, with 12GB of video memory you can use the 6bpw version with a Q6 cache and 16k context.
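The 12GB example in the notes can be sanity-checked with rough arithmetic. The sketch below is an estimate under stated assumptions, not a measurement: it assumes about 12.2 billion parameters and a Mistral-Nemo-style layout (40 layers, 8 KV heads, head dimension 128), treats the Q6 cache as roughly 6 bits per element, and ignores quantization scales, the higher-precision head layer, activations, and framework overhead. None of these figures come from the README itself.

```python
GIB = 2**30  # bytes per GiB


def weights_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GiB."""
    return n_params * bits_per_weight / 8 / GIB


def kv_cache_gib(context: int, n_layers: int, n_kv_heads: int,
                 head_dim: int, cache_bits: float) -> float:
    """Approximate size of the key/value cache in GiB."""
    # One key and one value vector per layer and KV head for every token.
    elements_per_token = 2 * n_layers * n_kv_heads * head_dim
    return context * elements_per_token * cache_bits / 8 / GIB


# Assumed figures (not from the README): parameter count and architecture.
N_PARAMS = 12.2e9
N_LAYERS, N_KV_HEADS, HEAD_DIM = 40, 8, 128

weights = weights_gib(N_PARAMS, bits_per_weight=6)
cache = kv_cache_gib(16 * 1024, N_LAYERS, N_KV_HEADS, HEAD_DIM, cache_bits=6)
total = weights + cache

print(f"weights ~{weights:.2f} GiB, cache ~{cache:.2f} GiB, total ~{total:.2f} GiB")
```

Under these assumptions the weights come to roughly 8.5 GiB and the cache to just under 1 GiB, which leaves some headroom on a 12GB card and is consistent with the example in the notes. Actual usage will be higher once the ignored overheads are included.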
+
+# Original model card
 # NekoMix-12B
