zk67's Collections

Model Architecture

updated Jan 20

  • Sparse Upcycling: Training Mixture-of-Experts from Dense Checkpoints

    Paper • 2212.05055 • Published Dec 9, 2022 • 5