cgus committed · verified · Commit 23ea45a · Parent: b8e77df

Update README.md

Files changed (1):
  1. README.md +42 -4
README.md CHANGED
@@ -1,13 +1,51 @@
  ---
  base_model:
- - davidkim205/komt-solar-10.7b-sft-v5
- - colable/LDCC-CCK-slerp
- library_name: transformers
+ - freewheelin/free-solar-slerp-v0.3
  tags:
  - mergekit
  - merge
  license: mit
+ inference: false
+ language:
+ - ko
  ---
+ # free-solar-0.3-exl2
+ Original model: [free-solar-slerp-v0.3](https://huggingface.co/freewheelin/free-solar-slerp-v0.3)
+ Model creator: [freewheelin](https://huggingface.co/freewheelin/)
+
+ # Quants
+ [4bpw h6 (main)](https://huggingface.co/cgus/free-solar-slerp-v0.3/tree/main)
+ [4.25bpw h6](https://huggingface.co/cgus/free-solar-slerp-v0.3/tree/4.25bpw-h6)
+ [4.65bpw h6](https://huggingface.co/cgus/free-solar-slerp-v0.3/tree/4.65bpw-h6)
+ [5bpw h6](https://huggingface.co/cgus/free-solar-slerp-v0.3/tree/5bpw-h6)
+ [6bpw h6](https://huggingface.co/cgus/free-solar-slerp-v0.3/tree/6bpw-h6)
+ [8bpw h8](https://huggingface.co/cgus/free-solar-slerp-v0.3/tree/8bpw-h8)
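Each quant sits on its own branch of the repo, so you download the revision that matches the bpw you want. A minimal sketch, assuming the `huggingface_hub` Python package; the branch name comes from the list above and the output directory is a placeholder:

```python
from huggingface_hub import snapshot_download

# Each bpw variant lives on a separate branch; pass the branch name
# from the Quants list above as `revision` ("main" holds the 4bpw h6 quant).
snapshot_download(
    repo_id="cgus/free-solar-slerp-v0.3",
    revision="4.25bpw-h6",
    local_dir="free-solar-0.3-exl2",  # placeholder output directory
)
```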
+
+ # Quantization notes
+ Made with Exllamav2 0.0.15 using its default calibration dataset.
+ This model has unusually long loading times: a typical 11B model loads in about 30s on my PC, but this one takes 130s.
+ I have no idea why it loads so slowly; perhaps because it was originally FP32 instead of the usual FP16.
+ Overall VRAM usage and generation speed seem normal, though.
+
+ This seems to be primarily a Korean-language model.
+ I didn't realize it at first when I tried it, since the language wasn't explicitly listed.
+ I can't evaluate it in its main language, but it seems usable in English and, to some degree, in other languages.
+ When used in English it sometimes switches topics at random or starts writing in Korean, as if it occasionally forgets to emit the stop token.
+ But it has an interesting writing style in English and overall seems quite reasonable, so I decided to make a full set of quants.
+ I was also curious to try quantizing an FP32 model; RAM requirements were higher, but the process went smoothly without any issues.
+
+ ## How to run
+
+ This quantization format runs on the GPU and requires the ExLlamaV2 loader, which is available in the following applications (a minimal Python sketch follows the list):
+
+ [Text Generation Webui](https://github.com/oobabooga/text-generation-webui)
+
+ [KoboldAI](https://github.com/henk717/KoboldAI)
+
+ [ExUI](https://github.com/turboderp/exui)
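If you prefer loading the model from Python directly, here is a minimal sketch against the ExLlamaV2 library API of that era (0.0.x); the model directory is a placeholder for wherever you downloaded a quant branch:

```python
from exllamav2 import ExLlamaV2, ExLlamaV2Cache, ExLlamaV2Config, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

# Point at the directory holding one downloaded quant branch (placeholder path).
config = ExLlamaV2Config()
config.model_dir = "free-solar-0.3-exl2"
config.prepare()

model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)
model.load_autosplit(cache)  # load weights, splitting layers across available GPUs

tokenizer = ExLlamaV2Tokenizer(config)
generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)

settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.7

# The model is primarily Korean, but English prompts work too.
print(generator.generate_simple("The best way to learn a language is", settings, num_tokens=128))
```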
+
+ # Original model card
  # free-solar-0.3

  This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
@@ -21,4 +59,4 @@ This model was merged using the SLERP merge method.

  The following models were included in the merge:
  * [davidkim205/komt-solar-10.7b-sft-v5](https://huggingface.co/davidkim205/komt-solar-10.7b-sft-v5)
- * [colable/LDCC-CCK-slerp](https://huggingface.co/davidkim205/colable/LDCC-CCK-slerp)
+ * [colable/LDCC-CCK-slerp](https://huggingface.co/colable/LDCC-CCK-slerp)