Intelligent Grimm -- Open-ended Visual Storytelling via Latent Diffusion Models Paper • 2306.00973 • Published Jun 1, 2023 • 3
MegaFusion: Extend Diffusion Models towards Higher-resolution Image Generation without Further Tuning Paper • 2408.11001 • Published Aug 20, 2024 • 13
SceneGen: Single-Image 3D Scene Generation in One Feedforward Pass Paper • 2508.15769 • Published 1 day ago • 11
NextStep-1: Toward Autoregressive Image Generation with Continuous Tokens at Scale Paper • 2508.10711 • Published 8 days ago • 134
On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification Paper • 2508.05629 • Published 15 days ago • 154
Skywork UniPic: Unified Autoregressive Modeling for Visual Understanding and Generation Paper • 2508.03320 • Published 17 days ago • 59
Falcon-H1: A Family of Hybrid-Head Language Models Redefining Efficiency and Performance Paper • 2507.22448 • Published 24 days ago • 64
PartCrafter: Structured 3D Mesh Generation via Compositional Latent Diffusion Transformers Paper • 2506.05573 • Published Jun 5, 2025 • 78
Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations Paper • 2506.18898 • Published Jun 23, 2025 • 33
From Ideal to Real: Unified and Data-Efficient Dense Prediction for Real-World Scenarios Paper • 2506.20279 • Published Jun 25, 2025 • 19
InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity Paper • 2503.16418 • Published Mar 20, 2025 • 36
Guidance in the Frequency Domain Enables High-Fidelity Sampling at Low CFG Scales Paper • 2506.19713 • Published Jun 24, 2025 • 13