Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models Paper • 2501.01423 • Published Jan 2 • 43
GPT-ImgEval: A Comprehensive Benchmark for Diagnosing GPT4o in Image Generation Paper • 2504.02782 • Published Apr 3 • 58
R1-Onevision: Advancing Generalized Multimodal Reasoning through Cross-Modal Formalization Paper • 2503.10615 • Published Mar 13 • 17
Neural Gaffer: Relighting Any Object via Diffusion Paper • 2406.07520 • Published Jun 11, 2024 • 6
Prompting Depth Anything for 4K Resolution Accurate Metric Depth Estimation Paper • 2412.14015 • Published Dec 18, 2024 • 12
StreetCrafter: Street View Synthesis with Controllable Video Diffusion Models Paper • 2412.13188 • Published Dec 17, 2024