Focus on Neighbors and Know the Whole: Towards Consistent Dense Multiview Text-to-Image Generator for 3D Creation Paper • 2408.13149 • Published Aug 23, 2024
1000+ FPS 4D Gaussian Splatting for Dynamic Scene Rendering Paper • 2503.16422 • Published Mar 20 • 14
OminiControl2: Efficient Conditioning for Diffusion Transformers Paper • 2503.08280 • Published Mar 11
Consistent-Teacher: Towards Reducing Inconsistent Pseudo-targets in Semi-supervised Object Detection Paper • 2209.01589 • Published Sep 4, 2022
Image Editing As Programs with Diffusion Models Paper • 2506.04158 • Published 6 days ago • 21 • 2
Can MLLMs Guide Me Home? A Benchmark Study on Fine-Grained Visual Reasoning from Transit Maps Paper • 2505.18675 • Published 17 days ago • 23
Dimple: Discrete Diffusion Multimodal Large Language Model with Parallel Decoding Paper • 2505.16990 • Published 18 days ago • 20
dKV-Cache: The Cache for Diffusion Language Models Paper • 2505.15781 • Published 19 days ago • 16
OminiControl Art 🎨: Transform images into artistic styles like Studio Ghibli • Running on Zero • 330