CORA: Conformal Risk-Controlled Agents for Safeguarded Mobile GUI Automation

📅 2026-04-10

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Mobile autonomous GUI agents can incur irreversible financial, privacy, or social risks, yet existing safety mechanisms lack formal guarantees and user controllability. This work proposes CORA, the first framework to integrate Conformal Risk Control into GUI automation safety. CORA employs a Guardian model to perform multimodal risk assessment prior to action execution and dynamically decides whether to abort high-risk operations. When an action is aborted, a Diagnostician model recommends targeted interventions, while a Goal-Lock mechanism anchors the agent to the user’s original intent to defend against visual injection attacks. Experiments on the newly introduced Phone-Harm benchmark and public datasets demonstrate that CORA significantly improves the Pareto frontier among safety, helpfulness, and interruption frequency, delivering a practical, user-tunable safety solution with statistical risk guarantees.

📝 Abstract

Graphical user interface (GUI) agents powered by vision-language models (VLMs) are rapidly moving from passive assistance to autonomous operation. However, this unrestricted action space exposes users to severe and irreversible financial, privacy, or social harm. Existing safeguards, which rely on prompt engineering, brittle heuristics, and VLM-as-critic, lack formal verification and user-tunable guarantees. We propose CORA (COnformal Risk-controlled GUI Agent), a post-policy, pre-action safeguarding framework that provides statistical guarantees on harmful executed actions. CORA reformulates safety as selective action execution: we train a Guardian model to estimate action-conditional risk for each proposed step. Rather than thresholding raw scores, we leverage Conformal Risk Control to calibrate an execute/abstain boundary that satisfies a user-specified risk budget. Rejected actions are routed to a trainable Diagnostician model, which performs multimodal reasoning over them to recommend interventions (e.g., confirm, reflect, or abort) that minimize user burden. A Goal-Lock mechanism anchors assessment to a clarified, frozen user intent to resist visual injection attacks. To rigorously evaluate this paradigm, we introduce Phone-Harm, a new benchmark of mobile safety violations with step-level harm labels under real-world settings. Experiments on Phone-Harm and public benchmarks against diverse baselines validate that CORA improves the safety-helpfulness-interruption Pareto frontier, offering a practical, statistically grounded safety paradigm for autonomous GUI execution. Code and benchmark are available at cora-agent.github.io.
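To make the calibration step concrete, here is a minimal sketch of how an execute/abstain threshold could be chosen with the standard Conformal Risk Control rule. It is an illustration under stated assumptions, not the paper's implementation: the function name `calibrate_threshold`, the convention "execute when the risk score is below the threshold", and the 0/1 loss "a harmful action was executed" are all hypothetical choices made for this example. Given n calibration steps, it picks the most permissive threshold whose adjusted empirical risk, (number of harmful executed steps + 1) / (n + 1), stays within the user's risk budget alpha.

```python
import numpy as np


def calibrate_threshold(scores, harmful, alpha):
    """Pick the largest threshold tau whose CRC-adjusted risk is <= alpha.

    Assumed convention (not taken from the paper): an action is executed
    when its Guardian risk score is strictly below tau, and the loss for a
    calibration step is 1 if a harmful action would have been executed.
    """
    scores = np.asarray(scores, dtype=float)
    harmful = np.asarray(harmful, dtype=bool)
    n = len(scores)

    # Candidate thresholds: every distinct score, plus +inf (execute all).
    candidates = np.concatenate([np.unique(scores), [np.inf]])

    best = -np.inf  # fallback: abstain on everything
    for tau in candidates:
        harmful_executed = np.sum(harmful & (scores < tau))
        # CRC bound with loss upper bound B = 1.
        adjusted_risk = (harmful_executed + 1.0) / (n + 1)
        if adjusted_risk <= alpha:
            best = tau
        else:
            break  # the loss only grows as tau increases
    return best


# Toy calibration set: three benign steps, two harmful ones.
cal_scores = [0.1, 0.2, 0.3, 0.8, 0.9]
cal_harmful = [False, False, False, True, True]
tau = calibrate_threshold(cal_scores, cal_harmful, alpha=0.2)
```

On this toy data the rule settles on a threshold of 0.8: every benign step is executed and both harmful steps are held back. When the budget is smaller than 1 / (n + 1), no threshold qualifies and the sketch returns negative infinity, meaning every action is rejected.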

Problem

Research questions and friction points this paper is trying to address.

GUI automation

safety

risk control

autonomous agents

harm prevention

Innovation

Methods, ideas, or system contributions that make the work stand out.

Conformal Risk Control

Selective Action Execution

Guardian-Diagnostician Architecture
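
The selective-execution flow described in the abstract can be sketched as a small routing function. This is a hypothetical illustration only: `Step`, `Decision`, `safeguard`, and the callable signatures for the Guardian and Diagnostician are names invented for this example, and the real models are multimodal networks rather than simple functions. The sketch assumes the Guardian returns a risk score, that actions scoring below a calibrated threshold `tau` are executed, and that everything else is handed to the Diagnostician, which picks an intervention.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Decision(Enum):
    EXECUTE = "execute"
    CONFIRM = "confirm"   # ask the user before proceeding
    REFLECT = "reflect"   # send the agent back to reconsider
    ABORT = "abort"       # stop the task


@dataclass(frozen=True)
class Step:
    locked_goal: str      # frozen user intent (the Goal-Lock idea)
    screenshot: bytes     # current screen, possibly containing injected text
    action: str           # action proposed by the GUI agent


def safeguard(
    step: Step,
    guardian: Callable[[Step], float],
    diagnostician: Callable[[Step], Decision],
    tau: float,
) -> Decision:
    """Route one proposed action: execute it or return an intervention."""
    risk = guardian(step)  # scored against the locked goal, not the screen text
    if risk < tau:
        return Decision.EXECUTE
    decision = diagnostician(step)
    # Assumption of this sketch: a rejected action is never executed directly,
    # so an EXECUTE verdict from the Diagnostician is downgraded to ABORT.
    return Decision.ABORT if decision is Decision.EXECUTE else decision


# Stand-in models for demonstration only.
def toy_guardian(step: Step) -> float:
    return 0.9 if "transfer" in step.action else 0.1


def toy_diagnostician(step: Step) -> Decision:
    return Decision.CONFIRM


risky = Step("check my balance", b"", "tap transfer button")
benign = Step("check my balance", b"", "scroll down")
risky_result = safeguard(risky, toy_guardian, toy_diagnostician, tau=0.8)
benign_result = safeguard(benign, toy_guardian, toy_diagnostician, tau=0.8)
```

Keeping the threshold check separate from the Diagnostician means the calibrated boundary alone decides what gets executed, while the Diagnostician only chooses how to handle what was rejected.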