
meituan/DeepSeek-R1-Channel-INT8

Text Generation
Transformers
Safetensors
deepseek_v3
conversational
custom_code
text-generation-inference
Community discussions (12)
Can ignored layers be supported in the w8a8_int8 quantization setting?

#12 opened 14 days ago by
jgfly

Can I run this model on AMD GPUs, or is it only compatible with Nvidia GPUs?

#11 opened about 1 month ago by
luciagan

Update inference/bf16_cast_channel_int8.py

#10 opened 2 months ago by
HandH1998

Update config.json

#9 opened 2 months ago by
HandH1998

How to achieve 2500 TPS throughput?

#8 opened 2 months ago by
muziyongshixin

Can this model run with `ollama` in pure-CPU mode?

#7 opened 2 months ago by
ice6

Add `quantization_config` in config.json?

4
#4 opened 2 months ago by
WeiwenXia
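
Discussion #4 asks about adding a `quantization_config` entry to config.json. Below is a minimal, hypothetical sketch of how such an entry might be patched in: the key names inside the block (`quant_method`, `weight_bits`, `activation_bits`) are assumptions chosen to match the model's channel-wise W8A8 INT8 scheme, not values confirmed by this repo or the thread.

```python
import json

# Hypothetical quantization_config for a channel-wise W8A8 INT8 checkpoint.
# All key names here are illustrative assumptions, not the repo's actual schema.
QUANT_CONFIG = {
    "quant_method": "w8a8_int8",
    "weight_bits": 8,
    "activation_bits": 8,
}

def add_quantization_config(config: dict) -> dict:
    """Return a copy of a model config dict with a quantization_config added
    (existing quantization_config entries are left untouched)."""
    patched = dict(config)
    patched.setdefault("quantization_config", QUANT_CONFIG)
    return patched

if __name__ == "__main__":
    # Example: patch a minimal config and print the result as JSON.
    cfg = {"model_type": "deepseek_v3"}
    print(json.dumps(add_quantization_config(cfg), indent=2))
```

In practice one would load the repo's real config.json, apply a patch like this, and write it back; `setdefault` ensures a hand-edited config is never overwritten.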

sglang reports an OOM error after running channel INT8

1
#3 opened 2 months ago by
zhangneilc