ThinkAct: Vision-Language-Action Reasoning via Reinforced Visual Latent Planning • Paper • 2507.16815 • Published 4 days ago • 28
V2V-LLM: Vehicle-to-Vehicle Cooperative Autonomous Driving with Multi-Modal Large Language Models • Paper • 2502.09980 • Published Feb 14 • 4
SANER: Annotation-free Societal Attribute Neutralizer for Debiasing CLIP • Paper • 2408.10202 • Published Aug 19, 2024
AuraFusion360: Augmented Unseen Region Alignment for Reference-based 360° Unbounded Scene Inpainting • Paper • 2502.05176 • Published Feb 7 • 38
Omni-RGPT: Unifying Image and Video Region-level Understanding via Token Marks • Paper • 2501.08326 • Published Jan 14 • 34
Hymba: A Hybrid-head Architecture for Small Language Models • Paper • 2411.13676 • Published Nov 20, 2024 • 46
HERMES: temporal-coHERent long-forM understanding with Episodes and Semantics • Paper • 2408.17443 • Published Aug 30, 2024 • 2
EoRA: Training-free Compensation for Compressed LLM with Eigenspace Low-Rank Approximation • Paper • 2410.21271 • Published Oct 28, 2024 • 7
Learning Continuous Exposure Value Representations for Single-Image HDR Reconstruction • Paper • 2309.03900 • Published Sep 7, 2023 • 1
SemPLeS: Semantic Prompt Learning for Weakly-Supervised Semantic Segmentation • Paper • 2401.11791 • Published Jan 22, 2024 • 1
PartDistill: 3D Shape Part Segmentation by Vision-Language Model Distillation • Paper • 2312.04016 • Published Dec 7, 2023 • 1
Probabilistic 3D Multi-Object Cooperative Tracking for Autonomous Driving via Differentiable Multi-Sensor Kalman Filter • Paper • 2309.14655 • Published Sep 26, 2023 • 1
2D-3D Interlaced Transformer for Point Cloud Segmentation with Scene-Level Supervision • Paper • 2310.12817 • Published Oct 19, 2023 • 1
Conditional Modeling Based Automatic Video Summarization • Paper • 2311.12159 • Published Nov 20, 2023 • 1
Kinship Representation Learning with Face Componential Relation • Paper • 2304.04546 • Published Apr 10, 2023 • 1
Temporal Attentive Alignment for Large-Scale Video Domain Adaptation • Paper • 1907.12743 • Published Jul 30, 2019 • 1
Temporal Attentive Alignment for Video Domain Adaptation • Paper • 1905.10861 • Published May 26, 2019 • 1