cgus committed on
Commit 9b22287 · verified · 1 Parent(s): 8c9db2d

Update README.md

Files changed (1)
  1. README.md +25 -5
README.md CHANGED
@@ -1,10 +1,7 @@
1
  ---
2
  base_model:
3
- - IlyaGusev/saiga_nemo_12b
4
- - Vikhrmodels/Vikhr-Nemo-12B-Instruct-R-21-09-24
5
- - TheDrummer/Rocinante-12B-v1.1
6
- - MarinaraSpaghetti/NemoMix-Unleashed-12B
7
- library_name: transformers
8
  tags:
9
  - mergekit
10
  - merge
@@ -15,6 +12,29 @@ language:
15
  - ru
16
  - en
17
  ---
18
  # NekoMix-12B
19
 
20
  ![NekoMix-12B](./remix.webp)
 
1
  ---
2
  base_model:
3
+ - Moraliane/NekoMix-12B
4
+ library_name: exllamav2
 
 
 
5
  tags:
6
  - mergekit
7
  - merge
 
12
  - ru
13
  - en
14
  ---
15
+ # NekoMix-12B-exl2
16
+ Original model: [NekoMix-12B](https://huggingface.co/Moraliane/NekoMix-12B) by [Moraliane](https://huggingface.co/Moraliane)
17
+
18
+ ## Quants
19
+ [4bpw h6 (main)](https://huggingface.co/cgus/NekoMix-12B-exl2/tree/main)
20
+ [4.5bpw h6](https://huggingface.co/cgus/NekoMix-12B-exl2/tree/4.5bpw-h6)
21
+ [5bpw h6](https://huggingface.co/cgus/NekoMix-12B-exl2/tree/5bpw-h6)
22
+ [6bpw h6](https://huggingface.co/cgus/NekoMix-12B-exl2/tree/6bpw-h6)
23
+ [8bpw h8](https://huggingface.co/cgus/NekoMix-12B-exl2/tree/8bpw-h8)
24
+
25
+ ## Quantization notes
26
+ Made with ExLlamaV2 0.2.8 using the default calibration dataset.
27
+ It appears to be primarily a Russian RP model. Its output quality has not been evaluated here.
28
+ It can be used with TabbyAPI or Text-Generation-WebUI on an Nvidia RTX GPU on Windows, or on an RTX or ROCm-capable GPU on Linux.
29
+ ExLlamaV2 doesn't support offloading to RAM, so make sure the model fits in your GPU's VRAM. Otherwise use GGUF quants instead.
30
+ For example, 12GB of VRAM is enough for the 6bpw quant with Q6 cache at 16k context.
31
+
32
+ Эта модель может использоваться с TabbyAPI или Text-Generation-WebUI.
33
+ Для работы с ней требуется Nvidia RTX (Windows) или RTX/ROCm (Linux).
34
+ Exl2 формат требует, чтобы модель полностью помещалась в видеопамяти.
35
+ Например, с 12ГБ видеопамяти можно использовать 6bpw версию с Q6 кэшем с 16k контекстом.
36
+
37
+ # Original model card
38
  # NekoMix-12B
39
 
40
  ![NekoMix-12B](./remix.webp)
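
The quantization notes added in this diff say the model must fit entirely in VRAM and list one branch per quant. As a rough illustration only, the sketch below estimates the weight size for each listed bits-per-weight value and picks the largest quant that fits a budget. The parameter count (about 12.2e9) and the 2.5 GiB reserve for cache and overhead are assumptions, not figures from the README, and the estimate ignores the separate head precision (h6/h8). The `download` helper uses `snapshot_download` from `huggingface_hub` with the branch names from the Quants list; it needs network access and is not called here.

```python
# Quants listed in the README diff: label -> (bits per weight, repo branch).
QUANTS = {
    "4bpw": (4.0, "main"),
    "4.5bpw": (4.5, "4.5bpw-h6"),
    "5bpw": (5.0, "5bpw-h6"),
    "6bpw": (6.0, "6bpw-h6"),
    "8bpw": (8.0, "8bpw-h8"),
}

# ASSUMPTION: about 12.2e9 parameters, a typical figure for Mistral Nemo 12B
# merges; the README does not state the parameter count.
ASSUMED_PARAMS = 12.2e9


def weight_gib(bpw, params=ASSUMED_PARAMS):
    """Approximate size of the quantized weights in GiB (head precision ignored)."""
    return params * bpw / 8 / 2**30


def largest_fitting(budget_gib):
    """Return the label of the highest-bpw quant whose weights fit, or None."""
    fitting = [
        (bpw, label)
        for label, (bpw, _branch) in QUANTS.items()
        if weight_gib(bpw) <= budget_gib
    ]
    return max(fitting)[1] if fitting else None


def download(label, local_dir):
    """Fetch one quant branch (requires network and `pip install huggingface_hub`)."""
    from huggingface_hub import snapshot_download

    _bpw, branch = QUANTS[label]
    return snapshot_download(
        repo_id="cgus/NekoMix-12B-exl2", revision=branch, local_dir=local_dir
    )


if __name__ == "__main__":
    # ASSUMPTION: keep 2.5 GiB of a 12 GiB card free for KV cache and overhead.
    print(largest_fitting(12.0 - 2.5))
```

Under these assumptions the helper selects the 6bpw quant for a 12 GiB card, which is consistent with the README's own example. Actual usage depends on context length, cache quantization, and backend overhead, so treat the numbers as a starting point.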