Could you release a quantized version in GGUF format, e.g. nf4?

by manbaout - opened Jul 24

Jul 24

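The fp8 version of the 14b model is very unfriendly to machines with 16 GB VRAM + 32 GB RAM: it runs out of memory. Releasing a lower-precision quantization (e.g. q4 or nf4) would make the model usable on far more hardware.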
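For a rough sense of the numbers, here is a back-of-the-envelope sketch. It assumes exactly 14 billion parameters (taken from the "14b" name) and counts only the weights; activations, quantization scales, and any other components loaded alongside the model are ignored, so real usage will be higher. The function name is just for illustration.

```python
def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (10^9 bytes), ignoring quantization metadata."""
    return num_params * bits_per_weight / 8 / 1e9


# Assumption: nominal parameter count read off the "14b" model name.
PARAMS_14B = 14e9

fp8_gb = weight_memory_gb(PARAMS_14B, 8)       # 8 bits per weight
four_bit_gb = weight_memory_gb(PARAMS_14B, 4)  # 4 bits per weight (q4 / nf4 class)

print(f"fp8:   ~{fp8_gb:.1f} GB of weights")
print(f"4-bit: ~{four_bit_gb:.1f} GB of weights")
```

Under these assumptions the fp8 weights alone take up most of a 16 GB card, while a 4-bit variant would need roughly half as much.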

Sep 2
