AI Summary
This study addresses the current lack of a systematic understanding of user interaction mechanisms with large language model-driven computer-use agents and the key design factors influencing their user experience (UX). Through a two-stage approach, the authors construct and empirically validate a UX design space for such agents. First, they synthesize findings from a literature review and expert interviews to develop a taxonomy encompassing dimensions such as user prompting, explainability, and user control. Second, they conduct a Wizard-of-Oz experiment across normal, error, and high-risk scenarios to observe user behaviors, revealing interdependencies among design dimensions and the diversity of user needs. This work presents the first systematically formulated and empirically validated UX design framework for LLM-driven agents, offering developers a structured and actionable foundation for design decisions.
Abstract
Large language model (LLM)-based computer use agents execute user commands by interacting with available UI elements, but little is known about how users want to interact with these agents or what design factors matter for their user experience (UX). We conducted a two-phase study to map the UX design space for computer use agents. In Phase 1, we reviewed existing systems to develop a taxonomy of UX considerations, then refined it through interviews with eight UX and AI practitioners. The resulting taxonomy included categories such as user prompts, explainability, user control, and users' mental models, with corresponding subcategories and example design features. In Phase 2, we ran a Wizard-of-Oz study with 20 participants, where a researcher acted as a web-based computer use agent and probed user reactions during normal, error-prone, and risky execution. We used the findings to validate the taxonomy from Phase 1 and deepen our understanding of the design space by identifying the connections between design areas and divergence in user needs and scenarios. Our taxonomy and empirical insights provide a map for developers to consider different aspects of user experience in computer use agent design and to situate their designs within users' diverse needs and scenarios.