🤖 AI Summary
Current computer-use agents generalize poorly to unseen or changing user interfaces and often fail to complete tasks robustly. This work analyzes how Nielsen’s ten usability heuristics affect agent performance, identifies which principles transfer to agents and which implicit design assumptions cause agent-specific failures, and proposes safe, additive interface augmentations. The approach is evaluated in UI-Verse, a suite of controlled environments built around functionally similar interfaces with different applied heuristics, using both agent experiments and human studies. The augmented heuristics consistently improve task completion and modestly improve efficiency, with combined heuristics yielding further gains. Human studies show that the modified designs preserve the original interaction workflow without observable usability regressions.
📝 Abstract
Recent advances have enabled general computer-use agents that interpret screens and execute grounded actions from human instructions, yet they still struggle to generalize to unseen and evolving interfaces. While improving agent capability remains important, agent-compatible interface design offers a complementary path by aligning interaction semantics with agent prior knowledge. In this paper, we revisit Nielsen's 10 usability heuristics through the lens of computer-use agents, identifying which principles naturally transfer, where implicit design assumptions create agent-specific failures, and how safe additive augmentations can improve robustness without harming human usability. To evaluate these ideas, we introduce UI-Verse, a suite of controlled environments built around functionally similar interfaces with different applied heuristics. Experiments show that our augmented heuristics consistently improve task completion and modestly improve efficiency, with combined heuristics yielding further gains. Human studies further show that these designs preserve the original interaction workflow without observable usability regressions. Overall, our findings highlight interface design as a practical complementary avenue for improving the reliability and generalization of computer-use agents.