Hugging Face
ibm's Collections

Power-LM

updated Oct 17, 2024

Dense and MoE LLMs trained with the Power learning rate scheduler.

  • Power Scheduler: A Batch Size and Token Number Agnostic Learning Rate Scheduler

    Paper • arXiv:2408.13359 • Published Aug 23, 2024 • 25 upvotes

  • ibm-research/PowerLM-3b

    Text Generation • Updated Sep 16, 2024 • 11.3k downloads • 20 likes

  • ibm-research/PowerMoE-3b

    Text Generation • Updated Sep 24, 2024 • 29.9k downloads • 13 likes