ARMZyany committed on
Commit d6706ea · verified · 1 Parent(s): 5a294ab

Update README.md

Files changed (1)
  1. README.md +4 -1
README.md CHANGED
@@ -2,4 +2,7 @@
  license: apache-2.0
  base_model:
  - ARMZyany/Cascade0-159M-Instruct-45k
- ---
+ ---
+ F16 is the best option. F32 is just too slow (on an RTX 3060M 6GB).
+ Q8_0 is faster than F16 and sometimes even produces better results than F16.
+ Anything below Q8 may come with a big tradeoff in quality. Q3 showed some very bad hallucinations.