IAG: Input-aware Backdoor Attack on VLMs for Visual Grounding
Abstract
A novel input-aware backdoor attack method, IAG, manipulates vision-language models to ground a specific target object in an image regardless of the user's query, using a text-conditional U-Net to generate adaptive triggers and a reconstruction loss to keep them stealthy.
Vision-language models (VLMs) have shown significant advancements in tasks such as visual grounding, where they localize specific objects in images based on natural-language queries. However, security issues in visual grounding for VLMs remain underexplored, especially in the context of backdoor attacks. In this paper, we introduce IAG, a novel input-aware backdoor attack method designed to manipulate the grounding behavior of VLMs. The attack forces the model to ground a specific target object in the input image, regardless of the user's query. We propose an adaptive trigger generator that embeds the semantic information of the attack target's description into the original image using a text-conditional U-Net, thereby overcoming the open-vocabulary attack challenge. To ensure the attack's stealthiness, we utilize a reconstruction loss to minimize visual discrepancies between poisoned and clean images. Additionally, we introduce a unified method for generating attack data. IAG is evaluated theoretically and empirically, demonstrating its feasibility and effectiveness. Notably, our [email protected] on InternVL-2.5-8B exceeds 65% across various test sets. IAG also shows promising potential in manipulating Ferret-7B and LLaVA-1.5-7B, with only a marginal accuracy drop on clean samples. Extensive additional experiments, including ablation studies and evaluations against potential defenses, further indicate the robustness and transferability of our attack.
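For a concrete picture of the mechanism, the sketch below illustrates the two ideas the abstract describes: a text-conditional U-Net-style generator that embeds the attack target's description into the image as an input-aware trigger, and a training objective combining a grounding term (pulling the predicted box toward the attacker's target) with a reconstruction term for stealthiness. This is a minimal PyTorch-style sketch under assumed shapes and layer choices; the class names, architecture details, and loss weighting are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TextConditionalTriggerGenerator(nn.Module):
    """Toy U-Net-style generator conditioned on a text embedding of the
    attack target's description (hypothetical layout, for illustration only)."""

    def __init__(self, text_dim: int = 512, base_ch: int = 32):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(3, base_ch, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(base_ch, base_ch * 2, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Project the text condition so it can be added to the bottleneck features.
        self.text_proj = nn.Linear(text_dim, base_ch * 2)
        self.dec = nn.Sequential(
            nn.ConvTranspose2d(base_ch * 2, base_ch, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(base_ch, 3, 4, stride=2, padding=1), nn.Tanh(),
        )

    def forward(self, image: torch.Tensor, text_emb: torch.Tensor) -> torch.Tensor:
        h = self.enc(image)                                # (B, 2*base_ch, H/4, W/4)
        cond = self.text_proj(text_emb)[:, :, None, None]  # broadcast over spatial dims
        delta = self.dec(h + cond)                         # input-aware trigger pattern
        return (image + delta).clamp(-1.0, 1.0)            # poisoned image


def attack_objective(poisoned, clean, pred_box, target_box, lam: float = 1.0):
    """Grounding term pulls the model's predicted box toward the attacker's
    target box; the reconstruction term keeps the poisoned image visually
    close to the clean one. `lam` is an assumed weight, not from the paper."""
    grounding_loss = F.l1_loss(pred_box, target_box)
    reconstruction_loss = F.mse_loss(poisoned, clean)
    return grounding_loss + lam * reconstruction_loss


# Usage sketch: a 224x224 image batch and a pooled text embedding of the
# attack target's description (e.g. from a frozen text encoder).
if __name__ == "__main__":
    gen = TextConditionalTriggerGenerator()
    image = torch.randn(2, 3, 224, 224).clamp(-1, 1)
    text_emb = torch.randn(2, 512)
    poisoned = gen(image, text_emb)
    print(poisoned.shape)  # torch.Size([2, 3, 224, 224])
```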
Community
IAG introduces a novel input-aware backdoor attack on vision-language models, demonstrating that visual grounding can be manipulated with minimal visual disruption and highlighting security vulnerabilities in VLMs.
This is an automated message from the Librarian Bot. The following similar papers were recommended by the Semantic Scholar API:
- VisualTrap: A Stealthy Backdoor Attack on GUI Agents via Visual Grounding Manipulation (2025)
- One Object, Multiple Lies: A Benchmark for Cross-task Adversarial Attack on Unified Vision-Language Models (2025)
- Simulated Ensemble Attack: Transferring Jailbreaks Across Fine-tuned Vision-Language Models (2025)
- 3S-Attack: Spatial, Spectral and Semantic Invisible Backdoor Attack Against DNN Models (2025)
- Losing Control: Data Poisoning Attack on Guided Diffusion via ControlNet (2025)
- Proactive Disentangled Modeling of Trigger–Object Pairings for Backdoor Defense (2025)
- CapRecover: A Cross-Modality Feature Inversion Attack Framework on Vision Language Models (2025)