GUIGuard: Toward a General Framework for Privacy-Preserving GUI Agents

📅 2026-01-26
📈 Citations: 1
Influential: 0
🤖 AI Summary
This work addresses the privacy risks that GUI agents create when screenshots containing sensitive information are sent to remote models, a problem compounded by the lack of systematic methods for identifying and protecting private content across interaction trajectories. It proposes GUIGuard, a three-stage framework for privacy-preserving GUI agents: privacy recognition, privacy protection, and task execution under protection. It also introduces GUIGuard-Bench, a cross-platform benchmark of 630 interaction trajectories and 13,830 screenshots with region-level privacy annotations. Evaluations show that existing agents recognize private content poorly, with state-of-the-art models reaching only 13.3% accuracy on Android and 1.4% on PC. Under privacy protection, task-planning semantics can still be maintained, and case studies on MobileWorld show that carefully designed protection strategies achieve higher task accuracy while preserving privacy.

📝 Abstract
GUI agents enable end-to-end automation through direct perception of and interaction with on-screen interfaces. However, these agents frequently access interfaces containing sensitive personal information, and screenshots are often transmitted to remote models, creating substantial privacy risks. These risks are particularly severe in GUI workflows: GUIs expose richer, more accessible private information, and privacy risks depend on interaction trajectories across sequential scenes. We propose GUIGuard, a three-stage framework for privacy-preserving GUI agents: (1) privacy recognition, (2) privacy protection, and (3) task execution under protection. We further construct GUIGuard-Bench, a cross-platform benchmark with 630 trajectories and 13,830 screenshots, annotated with region-level privacy grounding and fine-grained labels of risk level, privacy category, and task necessity. Evaluations reveal that existing agents exhibit limited privacy recognition, with state-of-the-art models achieving only 13.3% accuracy on Android and 1.4% on PC. Under privacy protection, task-planning semantics can still be maintained, with closed-source models showing stronger semantic consistency than open-source ones. Case studies on MobileWorld show that carefully designed protection strategies achieve higher task accuracy while preserving privacy. Our results highlight privacy recognition as a critical bottleneck for practical GUI agents. Project: https://futuresis.github.io/GUIGuard-page/
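The three stages named in the abstract (privacy recognition, privacy protection, task execution under protection) can be pictured with a toy sketch. This is not the authors' implementation: the screenshot is modelled as a grid of characters rather than pixels, and `PrivacyRegion`, `recognize`, `protect`, `execute`, the `detector` and `planner` callables, and the example labels are all hypothetical names chosen for illustration. The region fields loosely mirror the annotation types the abstract lists (region grounding, risk level, privacy category, task necessity).

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# (left, top, right, bottom); right and bottom are exclusive. Assumed convention.
Box = Tuple[int, int, int, int]
# Toy stand-in for an image: a grid of characters instead of pixels.
Screenshot = List[List[str]]


@dataclass
class PrivacyRegion:
    box: Box
    category: str         # hypothetical label, e.g. "pin"
    risk_level: str       # hypothetical scale, e.g. "high" or "low"
    task_necessary: bool  # True if the task cannot proceed without this content


def recognize(screen: Screenshot,
              detector: Callable[[Screenshot], List[PrivacyRegion]]) -> List[PrivacyRegion]:
    """Stage 1: privacy recognition, delegated to a caller-supplied detector."""
    return detector(screen)


def protect(screen: Screenshot, regions: List[PrivacyRegion],
            mask_char: str = "#") -> Screenshot:
    """Stage 2: return a copy with every non-task-necessary region masked."""
    masked = [row[:] for row in screen]  # copy so the original stays untouched
    for region in regions:
        if region.task_necessary:
            continue
        left, top, right, bottom = region.box
        for y in range(top, bottom):
            for x in range(left, right):
                masked[y][x] = mask_char
    return masked


def execute(masked: Screenshot, task: str,
            planner: Callable[[Screenshot, str], str]) -> str:
    """Stage 3: plan the next action from the protected screenshot only."""
    return planner(masked, task)


if __name__ == "__main__":
    screen = [list("PIN:1234"), list("Send OK ")]
    toy_detector = lambda s: [PrivacyRegion((4, 0, 8, 1), "pin", "high", False)]
    toy_planner = lambda s, task: f"{task} -> tap 'Send'"

    regions = recognize(screen, toy_detector)
    masked = protect(screen, regions)
    print("".join(masked[0]))  # PIN:####
    print(execute(masked, "confirm transfer", toy_planner))
```

In this sketch only the masked copy is passed to the planner, which reflects the abstract's concern that raw screenshots are often transmitted to remote models; how a real system masks pixels or decides task necessity is left open here.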
Problem

Research questions and friction points this paper is trying to address.

privacy-preserving
GUI agents
sensitive information
privacy risk
interaction trajectories
Innovation

Methods, ideas, or system contributions that make the work stand out.

privacy-preserving GUI agents
privacy recognition
GUI benchmark
trajectory-based privacy
semantic consistency under protection
Yanxi Wang
Beijing Normal University; Zhongguancun Academy
Zhiling Zhang
Shanghai Jiao Tong University
NLP for mental health, Knowledge Graph, Audio Analysis
Wenbo Zhou
University of Science and Technology of China
Weiming Zhang
University of Science and Technology of China
Jie Zhang
A*STAR
Qiannan Zhu
School of Artificial Intelligence, Beijing Normal University
knowledge graph, recommendation system, information retrieval
Yu Shi
Zhongguancun Academy; Zhongguancun Institution of Artificial Intelligence
Shuxin Zheng
Deputy Director, Zhongguancun Institute of Artificial Intelligence
General AI, Generative AI
Jiyan He
University of Science and Technology of China
Machine Learning, AI for Science