Thanks for IQ4_NL

#1
by KeyboardMasher - opened

Few people offer this quantization. It is very fast on ARM64 CPUs because, like Q4_0, it benefits from "repack".
Good for running small LLMs on an Android phone.
