🤖 AI Summary
AI agents lack fundamental understanding of human work practices, limiting their effective integration into collaborative workflows. Method: This paper introduces the first cross-domain (data analysis, engineering, computing, writing, design) systematic framework for comparing human and AI workflows, featuring a novel screen-operation-trajectory-based workflow reconstruction toolkit that enables structured, interpretable behavioral modeling. Contribution/Results: Empirical analysis reveals that AI agents predominantly follow rigid, programmatic execution paths—diverging markedly from human interface interaction patterns, even in vision-intensive tasks. Although AI achieves 88.3% faster task completion and reduces costs by 90.4%–96.2%, it exhibits lower output quality, including data fabrication and misuse of advanced tools. Its strengths are confined to programmable subtasks; thus, AI is best positioned as a high-efficiency collaborator within human-led workflows—not as an autonomous replacement.
📝 Abstract
AI agents are continually optimized for tasks related to human work, such as software engineering and professional writing, signaling a pressing trend with significant impacts on the human workforce. However, these agent developments have often not been grounded in a clear understanding of how humans execute work, to reveal what expertise agents possess and the roles they can play in diverse workflows. In this work, we study how agents do human work by presenting the first direct comparison of human and agent workers across multiple essential work-related skills: data analysis, engineering, computation, writing, and design. To better understand and compare heterogeneous computer-use activities of workers, we introduce a scalable toolkit to induce interpretable, structured workflows from either human or agent computer-use activities. Using such induced workflows, we compare how humans and agents perform the same tasks and find that: (1) While agents exhibit promise in their alignment to human workflows, they take an overwhelmingly programmatic approach across all work domains, even for open-ended, visually dependent tasks like design, creating a contrast with the UI-centric methods typically used by humans. (2) Agents produce work of inferior quality, yet often mask their deficiencies via data fabrication and misuse of advanced tools. (3) Nonetheless, agents deliver results 88.3% faster and cost 90.4-96.2% less than humans, highlighting the potential for enabling efficient collaboration by delegating easily programmable tasks to agents.