UnimixLM
pietrolesci/small_bpe128k • Updated Aug 8, 2025
pietrolesci/small_multigram128k • Updated Jul 24, 2025
pietrolesci/small_tokmix128k • Updated Jul 25, 2025
pietrolesci/small_unigramlm128k • Updated Jul 27, 2025

Interesting Pre-Training Datasets
Zyphra/Zyda-2 • Preview • Updated Aug 6, 2025 • 235k • 90
HuggingFaceTB/dclm-edu • Viewer • Updated Mar 7, 2025 • 1B • 3.17k • 31
HuggingFaceFW/fineweb-edu • Viewer • Updated Jul 11, 2025 • 3.5B • 318k • 1.01k
HuggingFaceTB/stack-edu • Viewer • Updated Mar 20, 2025 • 167M • 3.36k • 68