ImageNet Large Scale Visual Recognition Challenge Paper • 1409.0575 • Published Sep 1, 2014 • 9
Deep Speech 2: End-to-End Speech Recognition in English and Mandarin Paper • 1512.02595 • Published Dec 8, 2015 • 2
MIND: Math Informed syNthetic Dialogues for Pretraining LLMs Paper • 2410.12881 • Published Oct 15, 2024 • 1
Nemotron-H: A Family of Accurate and Efficient Hybrid Mamba-Transformer Models Paper • 2504.03624 • Published Apr 4, 2025 • 15
Cold Fusion: Training Seq2Seq Models Together with Language Models Paper • 1708.06426 • Published Aug 21, 2017 • 1
NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model Paper • 2508.14444 • Published Aug 20, 2025 • 36
Nemotron-CC-Math: A 133 Billion-Token-Scale High Quality Math Pretraining Dataset Paper • 2508.15096 • Published Aug 20, 2025 • 2