IAG: Input-aware Backdoor Attack on VLMs for Visual Grounding
Abstract
A novel input-aware backdoor attack method, IAG, manipulates vision-language models to ground a specific target object in an image regardless of the user's query, using a text-conditional U-Net to generate adaptive triggers and a reconstruction loss to keep them stealthy.
Vision-language models (VLMs) have shown significant advancements in tasks such as visual grounding, where they localize specific objects in images based on natural-language queries. However, security issues in visual grounding for VLMs remain underexplored, especially in the context of backdoor attacks. In this paper, we introduce IAG, a novel input-aware backdoor attack method designed to manipulate the grounding behavior of VLMs. The attack forces the model to ground a specific target object in the input image, regardless of the user's query. We propose an adaptive trigger generator that embeds the semantic information of the attack target's description into the original image using a text-conditional U-Net, thereby overcoming the open-vocabulary attack challenge. To ensure the attack's stealthiness, we utilize a reconstruction loss to minimize visual discrepancies between poisoned and clean images. Additionally, we introduce a unified method for generating attack data. IAG is evaluated theoretically and empirically, demonstrating its feasibility and effectiveness. Notably, our [email protected] on InternVL-2.5-8B exceeds 65% across various test sets. IAG also shows promising potential in manipulating Ferret-7B and LLaVA-1.5-7B, with only a marginal accuracy drop on clean samples. Extensive additional experiments, including ablation studies and evaluations against potential defenses, further indicate the robustness and transferability of our attack.
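For a concrete picture of the mechanism, the sketch below illustrates the two ideas the abstract describes: a text-conditional U-Net-style generator that embeds the attack target's description into the image as an input-aware trigger, and a training objective combining a grounding term (pulling the predicted box toward the attacker's target) with a reconstruction term for stealthiness. This is a minimal PyTorch-style sketch under assumed shapes and layer choices; the class names, architecture details, and loss weighting are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TextConditionalTriggerGenerator(nn.Module):
    """Toy U-Net-style generator conditioned on a text embedding of the
    attack target's description (hypothetical layout, for illustration only)."""

    def __init__(self, text_dim: int = 512, base_ch: int = 32):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(3, base_ch, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(base_ch, base_ch * 2, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Project the text condition so it can be added to the bottleneck features.
        self.text_proj = nn.Linear(text_dim, base_ch * 2)
        self.dec = nn.Sequential(
            nn.ConvTranspose2d(base_ch * 2, base_ch, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(base_ch, 3, 4, stride=2, padding=1), nn.Tanh(),
        )

    def forward(self, image: torch.Tensor, text_emb: torch.Tensor) -> torch.Tensor:
        h = self.enc(image)                                # (B, 2*base_ch, H/4, W/4)
        cond = self.text_proj(text_emb)[:, :, None, None]  # broadcast over spatial dims
        delta = self.dec(h + cond)                         # input-aware trigger pattern
        return (image + delta).clamp(-1.0, 1.0)            # poisoned image


def attack_objective(poisoned, clean, pred_box, target_box, lam: float = 1.0):
    """Grounding term pulls the model's predicted box toward the attacker's
    target box; the reconstruction term keeps the poisoned image visually
    close to the clean one. `lam` is an assumed weight, not from the paper."""
    grounding_loss = F.l1_loss(pred_box, target_box)
    reconstruction_loss = F.mse_loss(poisoned, clean)
    return grounding_loss + lam * reconstruction_loss


# Usage sketch: a 224x224 image batch and a pooled text embedding of the
# attack target's description (e.g. from a frozen text encoder).
if __name__ == "__main__":
    gen = TextConditionalTriggerGenerator()
    image = torch.randn(2, 3, 224, 224).clamp(-1, 1)
    text_emb = torch.randn(2, 512)
    poisoned = gen(image, text_emb)
    print(poisoned.shape)  # torch.Size([2, 3, 224, 224])
```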
Community
IAG introduces a novel input-aware backdoor attack on vision-language models, demonstrating that visual grounding can be manipulated with minimal visual disruption and highlighting security vulnerabilities in VLMs.
This is an automated message from the Librarian Bot. The following similar papers were recommended by the Semantic Scholar API:
- VisualTrap: A Stealthy Backdoor Attack on GUI Agents via Visual Grounding Manipulation (2025)
- One Object, Multiple Lies: A Benchmark for Cross-task Adversarial Attack on Unified Vision-Language Models (2025)
- Simulated Ensemble Attack: Transferring Jailbreaks Across Fine-tuned Vision-Language Models (2025)
- 3S-Attack: Spatial, Spectral and Semantic Invisible Backdoor Attack Against DNN Models (2025)
- Losing Control: Data Poisoning Attack on Guided Diffusion via ControlNet (2025)
- Proactive Disentangled Modeling of Trigger–Object Pairings for Backdoor Defense (2025)
- CapRecover: A Cross-Modality Feature Inversion Attack Framework on Vision Language Models (2025)