Pre-trained Large Language Models Learn Hidden Markov Models In-context Paper • 2506.07298 • Published 2 days ago • 16
SpatialLM: Training Large Language Models for Structured Indoor Modeling Paper • 2506.07491 • Published 2 days ago • 31
UniWorld: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation Paper • 2506.03147 • Published 7 days ago • 57
Quartet: Native FP4 Training Can Be Optimal for Large Language Models Paper • 2505.14669 • Published 21 days ago • 73
Vid2World: Crafting Video Diffusion Models to Interactive World Models Paper • 2505.14357 • Published 22 days ago • 26
Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective Paper • 2505.15045 • Published 21 days ago • 54
CompeteSMoE -- Statistically Guaranteed Mixture of Experts Training via Competition Paper • 2505.13380 • Published 22 days ago • 5
Reasoning Path Compression: Compressing Generation Trajectories for Efficient LLM Reasoning Paper • 2505.13866 • Published 22 days ago • 16
Training-Free Watermarking for Autoregressive Image Generation Paper • 2505.14673 • Published 21 days ago • 12
SageAttention3: Microscaling FP4 Attention for Inference and An Exploration of 8-Bit Training Paper • 2505.11594 • Published 25 days ago • 72
Emerging Properties in Unified Multimodal Pretraining Paper • 2505.14683 • Published 21 days ago • 130