Fourier Position Embedding: Enhancing Attention's Periodic Extension for Length Generalization • Paper • 2412.17739 • Published 26 days ago • 39
How to Synthesize Text Data without Model Collapse? • Paper • 2412.14689 • Published about 1 month ago • 48
MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies • Paper • 2404.06395 • Published Apr 9, 2024 • 22