Parameters / Experts - How to run this model ;
#16 opened 13 days ago
by
DavidAU

DeepSeek R1 0528?
#15 opened 24 days ago
by
Thireus
This model almost completely loses Chinese abilities
1
3
#14 opened about 2 months ago
by
CHNtentes
Base version?
3
2
#13 opened about 2 months ago
by
ToastyPigeon

Russian language is missing
1
#12 opened about 2 months ago
by
Kosh69
Please, share the custom vLLM source you made
1
#11 opened about 2 months ago
by
hyunw55
Update metadata 🤗
#10 opened about 2 months ago
by
merve

Model seems to not be performing correctly
1
#9 opened about 2 months ago
by
daniel-ltw
Larger model?
2
#8 opened about 2 months ago
by
blobbybob

number of experts +
2
#7 opened about 2 months ago
by
Danioken
Brainstorming
5
5
#6 opened about 2 months ago
by
Downtown-Case
Further training/distillation needed?
1
1
#5 opened about 2 months ago
by
mingyi456
Besides pruning..
6
#4 opened about 2 months ago
by
Lockout

Context size? YaRN still supported?
2
#3 opened about 2 months ago
by
Thireus
Variants
#2 opened about 2 months ago
by
someone13574
code
16
#1 opened about 2 months ago
by
mrfakename
