Physics of Language Models: Part 4.1, Architecture Design and the Magic of Canon Layers Paper • 2512.17351 • Published 18 days ago • 25
MolmoAct Collection All models for the MolmoAct (Multimodal Open Language Model for Action) release. • 10 items • Updated 13 days ago • 33
NVIDIA Nemotron v3 Collection Open, Production-ready Enterprise Models • 6 items • Updated 6 days ago • 112
Nemotron-Pre-Training-Datasets Collection Large scale pre-training datasets used in the Nemotron family of models. • 11 items • Updated 13 days ago • 87
Nemotron-Post-Training-v3 Collection Collection of datasets used in the post-training phase of Nemotron Nano v3. • 7 items • Updated 13 days ago • 54
Nemotron-Cascade Collection Scaling Cascaded Reinforcement Learning for General-Purpose Reasoning Models • 18 items • Updated 5 days ago • 41
HyperVL: An Efficient and Dynamic Multimodal Large Language Model for Edge Devices Paper • 2512.14052 • Published 21 days ago • 40
QwenLong-L1.5: Post-Training Recipe for Long-Context Reasoning and Memory Management Paper • 2512.12967 • Published 22 days ago • 103
Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction Paper • 2412.04454 • Published Dec 5, 2024 • 71
UltraFlux: Data-Model Co-Design for High-quality Native 4K Text-to-Image Generation across Diverse Aspect Ratios Paper • 2511.18050 • Published Nov 22, 2025 • 37