Denoising as Adaptation: Noise-Space Domain Adaptation for Image Restoration Paper • 2406.18516 • Published Jun 26, 2024 • 1
Video-MMMU: Evaluating Knowledge Acquisition from Multi-Discipline Professional Videos Paper • 2501.13826 • Published 7 days ago • 21
VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding Paper • 2501.13106 • Published 8 days ago • 74
CityDreamer4D: Compositional Generative Model of Unbounded 4D Cities Paper • 2501.08983 • Published 15 days ago • 20
RepVideo: Rethinking Cross-Layer Representation for Video Generation Paper • 2501.08994 • Published 15 days ago • 15
BIOMEDICA: An Open Biomedical Image-Caption Archive, Dataset, and Vision-Language Models Derived from Scientific Literature Paper • 2501.07171 • Published 17 days ago • 49
Are VLMs Ready for Autonomous Driving? An Empirical Study from the Reliability, Data, and Metric Perspectives Paper • 2501.04003 • Published 23 days ago • 24
FRNet: Frustum-Range Networks for Scalable LiDAR Segmentation Paper • 2312.04484 • Published Dec 7, 2023
LiMoE: Mixture of LiDAR Representation Learners from Automotive Scenes Paper • 2501.04004 • Published 23 days ago • 1
LargeAD: Large-Scale Cross-Sensor Data Pretraining for Autonomous Driving Paper • 2501.04005 • Published 23 days ago
OVGaussian: Generalizable 3D Gaussian Segmentation with Open Vocabularies Paper • 2501.00326 • Published about 1 month ago • 1
Diffusion as Shader: 3D-aware Video Diffusion for Versatile Video Generation Control Paper • 2501.03847 • Published 23 days ago • 23
Efficient Diffusion Model for Image Restoration by Residual Shifting Paper • 2403.07319 • Published Mar 12, 2024
SeedVR: Seeding Infinity in Diffusion Transformer Towards Generic Video Restoration Paper • 2501.01320 • Published 28 days ago • 11
Post: Google drops Gemini 2.0 Flash Thinking, a new experimental model that unlocks stronger reasoning capabilities and shows its thoughts. The model plans (with thoughts visible), can solve complex problems with Flash speeds, and more. Now available in anychat, try it out: akhaliq/anychat