[W4A8 FP8 Quantization] Release of DeepSeek-V3.1 with SGLang Support – Near-Lossless & 1.56x Speed Boost!

#28 opened about 12 hours ago by

Carson

tool call for reasoning mode

#27 opened about 12 hours ago by

shing3232

V3.1 seems to be pretty bad at everything except coding and mathematics.

👀 ➕ 5

#26 opened about 13 hours ago by

qazqazqazqaz46

Latest comprehensive hands-on evaluation of DeepSeek-V3.1 is out (300+ dimensions); join the group to discuss

#25 opened about 14 hours ago by

JEIN

Has anyone tried deploying it locally with vLLM and connecting it to Claude Code?

#24 opened about 15 hours ago by

Yuxin362

Why do all the scale values of attn out_proj have the non-zero mantissa 0b111010101010010101001111111101?

#22 opened about 18 hours ago by

abcstar

Update system prompt to include tools

#21 opened about 21 hours ago by

bchenfireworks

recommended temp?

#19 opened about 23 hours ago by

createthis

Tool calling usage examples

#18 opened 1 day ago by

1000Xia

Context length: is it 128K (as mentioned in the model card) or 160K (as specified in config.json)?

#17 opened 1 day ago by

Lissanro

I wonder whether there will still be distilled models

#13 opened 1 day ago by

BlackLeee

bro you forgot to put it under the v3.1 collection

👀 1

#12 opened 1 day ago by

DingzhenPearl

no score on swe under thinking mode

➕ 👀 3

#11 opened 1 day ago by

vitvamer

This model’s censorship is insane

👍 🧠 5

#10 opened 1 day ago by

smile1030

Is the SimpleQA result a typo?

#9 opened 1 day ago by

rnc000

Add base model metadata

#8 opened 1 day ago by

davanstrien

Liang Wenfeng's garbage model

😎 🧠 3

#7 opened 1 day ago by

eiskalt

Does search use think mode?

#6 opened 1 day ago by

awdrgyjilplij

Congratulations to DeepSeek, this version seems powerful for coding and agent development

🔥 🚀 1

#5 opened 1 day ago by

Robin-Han

Downloading right away

#4 opened 1 day ago by

HowardChenRV

Any plan to release the post-training recipe?

#3 opened 1 day ago by

Yi30

Come on, third party bros! Deploy it!

👍 🔥 6

#2 opened 1 day ago by

DingzhenPearl

Leaving my name here before this blows up

#1 opened 1 day ago by

loveaim