🤖 AI Summary
Existing evaluation paradigms for intelligent agents overemphasize single-turn task completion, neglecting the inherently iterative nature of real-world problems and the human–agent collaboration they require, in which user goals are often ambiguous and evolve over time. Empirical evidence shows that state-of-the-art agents underperform in multi-turn collaborative settings, largely because they fail to sustain user engagement and scaffold user understanding through adaptive assistance.
Method: We propose collaborative effort scaling, an evaluation framework that models how an agent's utility grows with increasing user involvement, enabling systematic assessment of an agent's capacity to foster shared understanding and scaffold the collaborative process over sustained interaction.
Contribution/Results: Through case studies and simulated evaluations, collaborative effort scaling surfaces the capability gaps that state-of-the-art agents exhibit in realistic collaborative scenarios, and it offers both a lens for diagnosing agent behavior and practical guidance for designing more effective collaborative agents.
📝 Abstract
Current evaluations of agents remain centered around one-shot task completion, failing to account for the inherently iterative and collaborative nature of many real-world problems, where human goals are often underspecified and evolve. We argue for a shift from building and assessing task completion agents to developing collaborative agents, assessed not only by the quality of their final outputs but by how well they engage with and enhance human effort throughout the problem-solving process. To support this shift, we introduce collaborative effort scaling, a framework that captures how an agent's utility grows with increasing user involvement. Through case studies and simulated evaluations, we show that state-of-the-art agents often underperform in multi-turn, real-world scenarios, revealing a missing ingredient in agent design: the ability to sustain engagement and scaffold user understanding. Collaborative effort scaling offers a lens for diagnosing agent behavior and guiding development toward more effective interactions.
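As a rough illustration of the "utility grows with user involvement" framing, here is a minimal, hypothetical sketch rather than the paper's actual formulation: it proxies user effort by the number of interaction turns, records a toy utility score after each turn, and fits a slope, so an agent whose output plateaus after the first turn scores lower than one that keeps converting additional user input into gains. The `scaling_slope` helper, the turn counts, and all utility numbers are invented for illustration.

```python
import numpy as np

def scaling_slope(turns, utility):
    """Least-squares slope of utility vs. user effort: how much additional
    utility each extra turn of user involvement buys (illustrative metric)."""
    return float(np.polyfit(turns, utility, 1)[0])

# User effort proxied by the number of clarification/feedback turns (assumption).
turns = np.array([1, 2, 3, 4, 5])

# Toy utility trajectories (e.g., rubric scores of the working output, 0-1);
# these numbers are fabricated purely to show the contrast.
one_shot_agent = np.array([0.55, 0.57, 0.58, 0.58, 0.58])       # plateaus early
collaborative_agent = np.array([0.50, 0.62, 0.71, 0.79, 0.85])  # keeps improving

print(f"one-shot agent slope:      {scaling_slope(turns, one_shot_agent):.3f}")
print(f"collaborative agent slope: {scaling_slope(turns, collaborative_agent):.3f}")
```

Under this toy metric, the flat trajectory yields a near-zero slope while the improving one yields a clearly positive slope, which is the kind of contrast the framework is meant to expose.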