Eagle 2.5: Boosting Long-Context Post-Training for Frontier Vision-Language Models Paper • 2504.15271 • Published Apr 21 • 65
StableVITON: Learning Semantic Correspondence with Latent Diffusion Model for Virtual Try-On Paper • 2312.01725 • Published Dec 4, 2023 • 4
Forecasting Future International Events: A Reliable Dataset for Text-Based Event Modeling Paper • 2411.14042 • Published Nov 21, 2024
SphereDiff: Tuning-free Omnidirectional Panoramic Image and Video Generation via Spherical Latent Representation Paper • 2504.14396 • Published Apr 19 • 28
SphereDiff: Tuning-free Omnidirectional Panoramic Image and Video Generation via Spherical Latent Representation Paper • 2504.14396 • Published Apr 19 • 28
SphereDiff: Tuning-free Omnidirectional Panoramic Image and Video Generation via Spherical Latent Representation Paper • 2504.14396 • Published Apr 19 • 28 • 2
DynamicScaler: Seamless and Scalable Video Generation for Panoramic Scenes Paper • 2412.11100 • Published Dec 15, 2024 • 7
OpenVid-1M: A Large-Scale High-Quality Dataset for Text-to-video Generation Paper • 2407.02371 • Published Jul 2, 2024 • 55
Spatiotemporal Skip Guidance for Enhanced Video Diffusion Sampling Paper • 2411.18664 • Published Nov 27, 2024 • 24
SimBa: Simplicity Bias for Scaling Up Parameters in Deep Reinforcement Learning Paper • 2410.09754 • Published Oct 13, 2024 • 8
Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction Paper • 2404.02905 • Published Apr 3, 2024 • 71