deepseek-ai/DeepSeek-V2-Chat

Text Generation
Transformers
Safetensors
deepseek_v2
conversational
custom_code
text-generation-inference
Community (17)

ds-v2-chat

#17 opened about 1 month ago by
Elon7111

Dddv

#16 opened 6 months ago by
Hxnnsns

NaN issue when loading the model in FP16

#15 opened 8 months ago by
joeltseng

ImportError: This modeling file requires the following packages that were not found in your environment: flash_attn. Run `pip install flash_attn`

👍 3
#14 opened 11 months ago by
kang1

How much memory is needed for the 128k context length?

1
#13 opened about 1 year ago by
ggbondcxk

Implement MLA inference optimizations to DeepseekV2Attention

🔥 🤗 6
#12 opened about 1 year ago by
sy-chen

Can you provide a sample code for training with DeepSpeed ZeRO3?

2
#10 opened about 1 year ago by
SupercarryNg

Ollama support

👍 1
1
#9 opened about 1 year ago by
Dao3

MoE offloading strategy?

2
#8 opened about 1 year ago by
Minami-su

Update README.md

#7 opened about 1 year ago by
VanishingPsychopath

kv cache

👀 2
3
#6 opened about 1 year ago by
FrankWu

function/tool calling support

8
#5 opened about 1 year ago by
kaijietti

fail to run the example

8
#4 opened about 1 year ago by
Leymore

GPTQ plz

10
#3 opened about 1 year ago by
Parkerlambert123

vLLM support

7
#2 opened about 1 year ago by
Sihangli

llama.cpp support

➕ 👍 18
5
#1 opened about 1 year ago by
cpumaxx