🤖 AI Summary
This work addresses the lack of a unified definition of autonomy in GUI agents, which hinders the evaluation of their capabilities, responsibilities, and associated risks. To close this gap, the paper introduces the GUI Agent Autonomy Levels (GAL) framework, described as the first to systematically delineate six progressive levels of autonomy in software interaction. Grounded in conceptual modeling and human-computer interaction analysis, GAL offers a common reference for assessing and comparing diverse GUI agents. By providing a clear taxonomy of autonomous behaviors, the framework supports the development of trustworthy, interpretable, and accountable human-agent interaction systems.
📝 Abstract
GUI agents are rapidly becoming a new interface to software, allowing people to delegate tasks across web, desktop, and mobile applications rather than execute them click by click. Yet the term "agent" is used to describe radically different degrees of autonomy, obscuring capability, responsibility, and risk. We call for conceptual clarity through GUI Agent Autonomy Levels (GAL), a six-level framework that makes autonomy explicit and helps benchmark progress toward trustworthy software interaction.