
SwallowMath

updated 29 days ago

Rewriting Pre-Training Data Boosts LLM Performance in Math and Code

  • tokyotech-llm/swallow-math

    Updated 26 days ago • 4.33M rows • 5.9k downloads • 26 likes

  • tokyotech-llm/Llama-3.1-8B-math-ablation-exp2-LR2.5e-5-WD0.1-iter0002500

    Updated 29 days ago • 6

  • tokyotech-llm/Llama-3.1-8B-math-ablation-exp2-LR2.5e-5-WD0.1-iter0005000

    Updated 29 days ago • 3

  • tokyotech-llm/Llama-3.1-8B-math-ablation-exp2-LR2.5e-5-WD0.1-iter0007500

    Updated 29 days ago • 3

  • tokyotech-llm/Llama-3.1-8B-math-ablation-exp2-LR2.5e-5-WD0.1-iter0010000

    Updated 29 days ago • 4

  • tokyotech-llm/Llama-3.1-8B-math-ablation-exp2-LR2.5e-5-WD0.1-iter0012500

    Updated 29 days ago • 10

  • tokyotech-llm/Llama-3.1-8B-math-ablation-exp1-LR2.5e-5-WD0.1-iter0002500

    Updated 29 days ago • 6

  • tokyotech-llm/Llama-3.1-8B-math-ablation-exp1-LR2.5e-5-WD0.1-iter0005000

    Updated 29 days ago • 3

  • tokyotech-llm/Llama-3.1-8B-math-ablation-exp1-LR2.5e-5-WD0.1-iter0007500

    Updated 29 days ago • 3

  • tokyotech-llm/Llama-3.1-8B-math-ablation-exp1-LR2.5e-5-WD0.1-iter0010000

    Updated 29 days ago • 4

  • tokyotech-llm/Llama-3.1-8B-math-ablation-exp1-LR2.5e-5-WD0.1-iter0012500

    Updated 29 days ago • 7