CCI4.0: A Bilingual Pretraining Dataset for Enhancing Reasoning in Large Language Models Paper • 2506.07463 • Published 19 days ago • 10
Concise Reasoning, Big Gains: Pruning Long Reasoning Trace with Difficulty-Aware Prompting Paper • 2505.19716 • Published May 26 • 5
🤓Small-Datasets Collection Multi-stage high-quality datasets make the model more helpful! • 8 items • Updated 26 days ago • 3
🐶Doge-CheckPoints Collection A series of checkpoint weights that can continue training on new datasets without training spikes. • 6 items • Updated Apr 21 • 2
YuLan-Mini Collection A highly capable 2.4B lightweight LLM using only 1T pre-training data with all details. • 6 items • Updated Apr 14 • 16
Wonderful Matrices: Combining for a More Efficient and Effective Foundation Model Architecture Paper • 2412.11834 • Published Dec 16, 2024 • 8
Cheems: Wonderful Matrices More Efficient and More Effective Architecture Paper • 2407.16958 • Published Jul 24, 2024 • 4