Update README.md
README.md
CHANGED
@@ -136,17 +136,17 @@ ggc la
 
 |rank|quant|s/it|loading speed|
 |----|--------|---------|----------------|
-| 1 | q2_k |
-| 2 | q4_0 |
-| 3 | q4_1 |
-| 4 | q8_0 |
-| 5 | q3_k |
-| 6 | q5_0 |
-| 7 | iq4_nl |
-| 8 | q5_1 |
-| 9 | iq4_xs |
-| 10| iq3_s |
-| 11| iq3_xxs|
+| 1 | q2_k | 6.40±.7 |🐖💨💨💨💨💨💨
+| 2 | q4_0 | 8.58±.5 |🐖🐖💨💨💨💨💨
+| 3 | q4_1 | 9.12±.5 |🐖🐖🐖💨💨💨💨
+| 4 | q8_0 | 9.45±.3 |🐖🐖🐖🐖💨💨💨
+| 5 | q3_k | 9.50±.3 |🐖🐖🐖🐖💨💨💨
+| 6 | q5_0 | 10.48±.5|🐖🐖🐖🐖🐖💨💨
+| 7 | iq4_nl | 10.55±.5|🐖🐖🐖🐖🐖💨💨
+| 8 | q5_1 | 10.65±.5|🐖🐖🐖🐖🐖💨💨
+| 9 | iq4_xs | 11.45±.7|🐖🐖🐖🐖🐖🐖💨
+| 10| iq3_s | 11.62±.9|🐢🐢🐢🐢🐢🐢💨
+| 11| iq3_xxs| 12.08±.9|🐢🐢🐢🐢🐢🐢🐢
 
 not all included in the initial test (*tested with a beginner laptop gpu only, if you have highend model, might find q8_0 running surprisingly faster than others), the rest of them, test it yourself; btw, the interesting thing is: the loading time required was not aligning with file size, due to the complexity of each calculation (dequant), and might vary from model
 
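As a reading aid for the s/it column, here is a minimal sketch (not part of the commit) that ranks the quants by mean seconds per iteration, expresses each as a ratio to the fastest one, and checks whether two ± ranges overlap. The numbers are copied from the added table rows; the names `BENCH`, `relative_slowdown`, and `overlapping` are illustrative, and treating the ± value as a simple symmetric range is an assumption, since the README does not say how it was measured.

```python
# Mean seconds per iteration and the ± spread, copied from the added table rows.
BENCH = {
    "q2_k": (6.40, 0.7),
    "q4_0": (8.58, 0.5),
    "q4_1": (9.12, 0.5),
    "q8_0": (9.45, 0.3),
    "q3_k": (9.50, 0.3),
    "q5_0": (10.48, 0.5),
    "iq4_nl": (10.55, 0.5),
    "q5_1": (10.65, 0.5),
    "iq4_xs": (11.45, 0.7),
    "iq3_s": (11.62, 0.9),
    "iq3_xxs": (12.08, 0.9),
}


def relative_slowdown(bench):
    """Return (quant, mean s/it, ratio to fastest) tuples, fastest first."""
    fastest = min(mean for mean, _ in bench.values())
    rows = [(name, mean, mean / fastest) for name, (mean, _) in bench.items()]
    return sorted(rows, key=lambda row: row[1])


def overlapping(bench, a, b):
    """True when the ± ranges of two quants overlap (assumed symmetric ranges)."""
    (mean_a, spread_a), (mean_b, spread_b) = bench[a], bench[b]
    return abs(mean_a - mean_b) <= spread_a + spread_b


if __name__ == "__main__":
    for rank, (name, mean, ratio) in enumerate(relative_slowdown(BENCH), start=1):
        print(f"{rank:>2} {name:<8} {mean:5.2f} s/it  x{ratio:.2f}")
```

Under that assumption, neighbouring ranks such as q8_0 and q3_k have overlapping ranges, so their order may flip on other hardware, which fits the note that results were taken on a single laptop GPU.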