Qwen/Qwen1.5-MoE-A2.7B-Chat-GPTQ-Int4

Text Generation

4-bit precision

Why is Int4 slower than unquantized float32 and float16?

#3 opened 3 months ago by

Qwen1.5-MoE-A2.7B-Chat-GPTQ-Int4 model takes too long to load (nearly 2 hours)

#2 opened 11 months ago by

Not working with sample code

#1 opened about 1 year ago by