🤖 AI Summary
Prior research lacks in-depth investigation into the security vulnerabilities of Large Vision-Language Model (LVLM)-driven mobile agents operating in real-world UI environments, and often relies on unrealistic threat assumptions, such as conspicuous UI overlays or high-privilege requirements.
Method: We propose a low-privilege, highly stealthy one-shot jailbreak attack in which malicious UI text embedded in otherwise benign apps hijacks the agent. The attack combines low-privilege injection into the agent's perception chain, touch-behavior recognition as an activation trigger, and a one-shot prompt-evasion algorithm, so the payload is exposed *only* during agent operation and remains invisible to human users.
Contribution/Results: Our approach integrates physical touch-feature detection, ADB-driven injection, a heuristic-guided character-level iterative-deepening A* search (HG-IDA*), and keyword-level detoxification. Evaluated across multiple backends, including GPT-4o, it achieves an 82.5% planning hijack rate and a 75.0% execution hijack rate, revealing for the first time systemic UI-layer security flaws shared by mainstream LVLM-based mobile agents.
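The touch-based trigger above can be illustrated with a minimal sketch. It assumes, as the summary describes, that ADB-injected taps differ from human touches in physical attributes (pressure, contact size, duration); the specific thresholds and field names below are illustrative assumptions, not values from the paper.

```python
# Hypothetical sketch: discriminating ADB-injected (agent) taps from human
# touches via physical touch attributes. Thresholds are illustrative only.
from dataclasses import dataclass

@dataclass
class TouchEvent:
    pressure: float      # human fingers report nonzero pressure; "adb shell input tap" typically ~0
    contact_size: float  # finger contact area; synthetic taps typically ~0
    duration_ms: float   # down-to-up interval; injected taps are near-instant
    jitter_px: float     # positional wobble during contact; humans wobble

def looks_like_agent(ev: TouchEvent) -> bool:
    """Return True when the touch profile matches synthetic (ADB) input."""
    return (ev.pressure < 0.01
            and ev.contact_size < 0.01
            and ev.duration_ms < 30
            and ev.jitter_px < 0.5)

def select_ui_text(ev: TouchEvent, benign: str, payload: str) -> str:
    """Reveal the injected prompt only when an agent appears to be driving the UI."""
    return payload if looks_like_agent(ev) else benign
```

This captures the design choice that makes the attack stealthy: human users only ever see the benign text, so the payload never surfaces during normal use.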
📝 Abstract
Large vision-language models (LVLMs) enable autonomous mobile agents to operate smartphone user interfaces, yet vulnerabilities to UI-level attacks remain critically understudied. Existing research often depends on conspicuous UI overlays, elevated permissions, or impractical threat models, limiting stealth and real-world applicability. In this paper, we present a practical and stealthy one-shot jailbreak attack that leverages in-app prompt injections: malicious applications embed short prompts in UI text that remain inert during human interaction but are revealed when an agent drives the UI via ADB (Android Debug Bridge). Our framework comprises three key components: (1) low-privilege perception-chain targeting, which injects payloads into malicious apps so that they enter the agent's visual inputs; (2) stealthy user-invisible activation, a touch-based trigger that discriminates agent from human touches using physical touch attributes and exposes the payload only during agent operation; and (3) one-shot prompt efficacy, a heuristic-guided, character-level iterative-deepening search algorithm (HG-IDA*) that performs one-shot, keyword-level detoxification to evade on-device safety filters. We evaluate across multiple LVLM backends, including closed-source services and representative open-source models, in three Android applications, and we observe high planning and execution hijack rates in single-shot scenarios (e.g., GPT-4o: 82.5% planning / 75.0% execution). These findings expose a fundamental security vulnerability in current mobile agents with immediate implications for autonomous smartphone operation.
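The third component, HG-IDA*, can be sketched as a standard iterative-deepening A* over character-level edits: the search substitutes characters inside blocked keywords until no filter keyword matches, using the count of still-flagged keywords as an admissible heuristic. The blocklist, the homoglyph edit operation, and the unit cost model below are assumptions for illustration, not the paper's actual filter or edit set.

```python
# Hypothetical sketch of heuristic-guided iterative-deepening A* (HG-IDA*-style)
# keyword-level detoxification. Blocklist and homoglyph table are illustrative.

BLOCKLIST = ["ignore", "override"]            # assumed on-device filter keywords
HOMOGLYPHS = {"i": "і", "o": "ο", "e": "е"}   # Latin -> Cyrillic/Greek look-alikes

def flagged(text: str) -> list[str]:
    """Keywords the (assumed) substring-matching filter would still catch."""
    low = text.lower()
    return [kw for kw in BLOCKLIST if kw in low]

def h(text: str) -> int:
    """Admissible heuristic: each remaining keyword needs >= 1 more edit."""
    return len(flagged(text))

def ida_star(text: str, max_depth: int = 8):
    """Deepen the cost bound until a filter-evading variant is found."""
    bound = h(text)
    while bound <= max_depth:
        result = _dfs(text, 0, bound)
        if result is not None:
            return result
        bound += 1
    return None

def _dfs(text: str, g: int, bound: int):
    f = g + h(text)
    if f > bound:
        return None            # prune: exceeds current cost bound
    if h(text) == 0:
        return text            # goal: no blocked keyword matches
    # Expand: substitute one character inside the first still-flagged keyword.
    kw = flagged(text)[0]
    start = text.lower().index(kw)
    for i in range(start, start + len(kw)):
        ch = text[i].lower()
        if ch in HOMOGLYPHS:
            child = text[:i] + HOMOGLYPHS[ch] + text[i + 1:]
            found = _dfs(child, g + 1, bound)
            if found is not None:
                return found
    return None
```

Because the heuristic never overestimates the remaining edits, iterative deepening returns a minimally edited prompt, which matches the one-shot goal of changing as little visible text as possible.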