ReFoCUS: Reinforcement-guided Frame Optimization for Contextual Understanding Paper • 2506.01274 • Published Jun 2, 2025 • 3
Look Every Frame All at Once: Video-Ma$^2$mba for Efficient Long-form Video Understanding with Multi-Axis Gradient Checkpointing Paper • 2411.19460 • Published Nov 29, 2024 • 11
SALOVA: Segment-Augmented Long Video Assistant for Targeted Retrieval and Routing in Long-Form Video Analysis Paper • 2411.16173 • Published Nov 25, 2024 • 10
CODE: Contrasting Self-generated Description to Combat Hallucination in Large Multi-modal Models Paper • 2406.01920 • Published Jun 4, 2024 • 1
Mitigating Adversarial Vulnerability through Causal Parameter Estimation by Adversarial Double Machine Learning Paper • 2307.07250 • Published Jul 14, 2023 • 2
Demystifying Causal Features on Adversarial Examples and Causal Inoculation for Robust Network by Adversarial Instrumental Variable Regression Paper • 2303.01052 • Published Mar 2, 2023 • 3
Distilling Robust and Non-Robust Features in Adversarial Examples by Information Bottleneck Paper • 2204.02735 • Published Apr 6, 2022 • 4
Masking Adversarial Damage: Finding Adversarial Saliency for Robust and Sparse Network Paper • 2204.02738 • Published Apr 6, 2022 • 3
What if...?: Counterfactual Inception to Mitigate Hallucination Effects in Large Multimodal Models Paper • 2403.13513 • Published Mar 20, 2024 • 1