When Should Users Check? A Decision-Theoretic Model of Confirmation Frequency in Multi-Step AI Agent Tasks

📅 2025-10-06

🤖 AI Summary

Existing AI agents face a confirmation-timing dilemma in multi-step tasks: confirming only at the end risks error propagation, while confirming every step imposes tedious overhead. Method: We model the placement of intermediate confirmation points as a minimum-time scheduling problem and build a decision-theoretic model around the Confirmation-Diagnosis-Correction-Redo (CDCR) pattern that users follow when monitoring errors. The work combines a formative study with 8 participants, theoretical modeling, and a within-subjects user study with 48 participants. Contribution/Results: The approach trades off interruption frequency against rollback cost. In the evaluation, 81% of participants preferred intermediate confirmation over the confirm-at-end approach, and task completion time decreased by 13.54%.

📝 Abstract

Existing AI agents typically execute multi-step tasks autonomously and only allow user confirmation at the end. During execution, users have little control, making the confirm-at-end approach brittle: a single error can cascade and force a complete restart. Confirming every step avoids such failures, but imposes tedious overhead. Balancing excessive interruptions against costly rollbacks remains an open challenge. We address this problem by modeling confirmation as a minimum-time scheduling problem. We conducted a formative study with eight participants, which revealed a recurring Confirmation-Diagnosis-Correction-Redo (CDCR) pattern in how users monitor errors. Based on this pattern, we developed a decision-theoretic model to determine time-efficient confirmation point placement. We then evaluated our approach in a within-subjects study in which 48 participants monitored AI agents and repaired their mistakes while executing tasks. Results show that 81 percent of participants preferred our intermediate confirmation approach over the confirm-at-end approach used by existing systems, and task completion time was reduced by 13.54 percent.

Problem

Research questions and friction points this paper is trying to address.

Optimizing confirmation frequency in multi-step AI tasks

Balancing user interruptions against error rollback costs

Determining time-efficient confirmation points through decision modeling

Innovation

Methods, ideas, or system contributions that make the work stand out.

Modeled confirmation as a minimum-time scheduling problem

Developed a decision-theoretic model for confirmation placement

Introduced intermediate confirmation points that reduced task completion time by 13.54 percent
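
To make the scheduling idea concrete, here is a minimal sketch of how confirmation placement could be posed as a minimum-expected-time problem. It is not the paper's model, whose details are not given on this page. All names (`segment_cost`, `place_confirmations`, `confirm_cost`, `repair_cost`) and all modeling choices are illustrative assumptions: steps fail independently with known probabilities below 1, errors surface only at a confirmation, and a failed segment costs a fixed diagnosis-and-correction time before being redone in full.

```python
from math import prod


def segment_cost(times, error_probs, start, end, confirm_cost, repair_cost):
    """Expected time to get steps [start, end) through one confirmation.

    Assumed model (hypothetical): every attempt runs the whole segment and
    then one confirmation. With probability 1 - p_ok the attempt contains an
    error, the user spends repair_cost diagnosing and correcting it, and the
    segment is redone. Solving E = T + (1 - p_ok) * (repair_cost + E) gives
    the closed form below. Requires every error probability to be < 1.
    """
    attempt_time = sum(times[start:end]) + confirm_cost
    p_ok = prod(1.0 - p for p in error_probs[start:end])
    return (attempt_time + (1.0 - p_ok) * repair_cost) / p_ok


def place_confirmations(times, error_probs, confirm_cost, repair_cost):
    """Dynamic program over the position of the last confirmation.

    Returns (expected_total_time, confirmation_points), where each point k
    means "confirm after step k" (1-based). The final step is always
    confirmed.
    """
    n = len(times)
    best = [0.0] + [float("inf")] * n  # best[j]: min expected time for steps 1..j
    prev = [0] * (n + 1)               # prev[j]: previous confirmation before j
    for end in range(1, n + 1):
        for start in range(end):
            cost = best[start] + segment_cost(
                times, error_probs, start, end, confirm_cost, repair_cost
            )
            if cost < best[end]:
                best[end] = cost
                prev[end] = start
    # Walk the back-pointers to recover the chosen confirmation points.
    points = []
    j = n
    while j > 0:
        points.append(j)
        j = prev[j]
    return best[n], points[::-1]
```

Under these assumptions, `segment_cost(times, error_probs, 0, len(times), confirm_cost, repair_cost)` gives the expected time of the confirm-at-end baseline, so it can be compared directly with the first value returned by `place_confirmations`. When errors are impossible the sketch confirms only at the end, and when errors are frequent and confirmation is cheap it confirms after every step.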

🔎 Similar Papers

A Decision Theoretic Framework for Measuring AI Reliance