12
Ziyue Li
Litzy619
AI & ML interests
None yet
Recent Activity
Upvoted a paper about 3 hours ago: Missing Premise exacerbates Overthinking: Are Reasoning Models losing Critical Thinking Skill?
Upvoted a paper about 3 hours ago: How Instruction and Reasoning Data shape Post-Training: Data Quality through the Lens of Layer-wise Gradients
Upvoted a paper 5 days ago: C3PO: Critical-Layer, Core-Expert, Collaborative Pathway Optimization for Test-Time Expert Re-Mixing
Organizations
None yet
Models (701)
Litzy619/OLMoE-1B-7B-0924-step1100000-tokens4613B-qlora • Updated Feb 12 • 2
Litzy619/OLMoE-1B-7B-0924-step855000-tokens3586B-qlora • Updated Feb 12 • 2
Litzy619/OLMoE-1B-7B-0924-step980000-tokens4110B-qlora • Updated Feb 12 • 2
Litzy619/OLMoE-1B-7B-0924-step735000-tokens3082B-qlora • Updated Feb 12 • 2
Litzy619/OLMoE-1B-7B-0924-step615000-tokens2579B-qlora • Updated Feb 12 • 3
Litzy619/OLMoE-1B-7B-0924-step490000-tokens2055B-qlora • Updated Feb 12 • 3
Litzy619/OLMoE-1B-7B-0924-step125000-tokens524B-qlora • Updated Feb 12 • 4
Litzy619/OLMoE-1B-7B-0924-step245000-tokens1027B-qlora • Updated Feb 12 • 1
Litzy619/OLMoE-1B-7B-0924-step370000-tokens1551B-qlora • Updated Feb 12 • 4
Litzy619/OLMoE-1B-7B-0924-step5000-tokens20B-qlora • Updated Feb 12 • 2
datasets
None public yet