The Lucie-7B LLM and the Lucie Training Dataset: Open resources for multilingual language generation • arXiv:2503.12294 • Published Mar 2025 • 1 upvote
Exploring the sustainable scaling of AI dilemma: A projective study of corporations' AI environmental impacts • arXiv:2501.14334 • Published Jan 24, 2025 • 20 upvotes
We Can't Understand AI Using our Existing Vocabulary • arXiv:2502.07586 • Published Feb 11, 2025 • 10 upvotes
SANA 1.5: Efficient Scaling of Training-Time and Inference-Time Compute in Linear Diffusion Transformer • arXiv:2501.18427 • Published Jan 30, 2025 • 18 upvotes
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling • arXiv:2502.06703 • Published Feb 10, 2025 • 146 upvotes
AuraFusion360: Augmented Unseen Region Alignment for Reference-based 360° Unbounded Scene Inpainting • arXiv:2502.05176 • Published Feb 7, 2025 • 32 upvotes
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model • arXiv:2502.02737 • Published Feb 4, 2025 • 213 upvotes
TwinMarket: A Scalable Behavioral and Social Simulation for Financial Markets • arXiv:2502.01506 • Published Feb 3, 2025 • 36 upvotes
You Do Not Fully Utilize Transformer's Representation Capacity • arXiv:2502.09245 • Published Feb 13, 2025 • 34 upvotes
MoM: Linear Sequence Modeling with Mixture-of-Memories • arXiv:2502.13685 • Published Feb 19, 2025 • 33 upvotes
Optimizing Large Language Model Training Using FP4 Quantization • arXiv:2501.17116 • Published Jan 28, 2025 • 36 upvotes
DeepFlow: Serverless Large Language Model Serving at Scale • arXiv:2501.14417 • Published Jan 24, 2025 • 3 upvotes