[W4A8 FP8 Quantization] Release of DeepSeek-V3.1 with SGLang Support – Near-Lossless & 1.56x Speed Boost!
#28 opened about 12 hours ago
by
Carson
tool call for reasoning mode
#27 opened about 12 hours ago
by
shing3232
V3.1 seems to be pretty bad at everything except coding and mathematics.
👀
➕
5
4
#26 opened about 13 hours ago
by
qazqazqazqaz46
Latest comprehensive hands-on evaluation of DeepSeek-V3.1 is out (300+ dimensions); you are welcome to join the group to discuss
#25 opened about 14 hours ago
by
JEIN
Has anyone tried deploying it locally with vLLM and connecting it to Claude Code?
#24 opened about 15 hours ago
by
Yuxin362
Why do all the scale values of attn out_proj have the non-zero mantissa 0b111010101010010101001111111101?
#22 opened about 18 hours ago
by
abcstar
Update system prompt to include tools
2
#21 opened about 21 hours ago
by
bchenfireworks
recommended temp?
1
#19 opened about 23 hours ago
by
createthis
Tool calling usage examples
#18 opened 1 day ago
by
1000Xia
Context length: is it 128K (as mentioned in the model card) or 160K (as specified in config.json)?
1
#17 opened 1 day ago
by
Lissanro

I wonder whether there will still be distilled models
3
#13 opened 1 day ago
by
BlackLeee
bro you forgot to put it under the v3.1 collection
👀
1
1
#12 opened 1 day ago
by
DingzhenPearl
no score on swe under thinking mode
➕
👀
3
2
#11 opened 1 day ago
by
vitvamer

This model’s censorship is insane
👍
🧠
5
11
#10 opened 1 day ago
by
smile1030
Is the SimpleQA result a typo?
4
#9 opened 1 day ago
by
rnc000
Add base model metadata
#8 opened 1 day ago
by
davanstrien

Liang Wenfeng's garbage model
😎
🧠
3
13
#7 opened 1 day ago
by
eiskalt
Does search use think mode?
#6 opened 1 day ago
by
awdrgyjilplij
Congratulations to DeepSeek, this version seems powerful for coding and agent development
🔥
🚀
1
1
#5 opened 1 day ago
by
Robin-Han

Downloading right away
#4 opened 1 day ago
by
HowardChenRV

Any plan to release the post-training recipe?
1
#3 opened 1 day ago
by
Yi30
Come on, third party bros! Deploy it!
👍
🔥
6
3
#2 opened 1 day ago
by
DingzhenPearl