Drift-Bench: Diagnosing Cooperative Breakdowns in LLM Agents under Input Faults via Multi-Turn Interaction

📅 2026-02-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the vulnerability of LLM agents to execution risks arising from cooperative failures in user input (e.g., implicit intent, ambiguous phrasing, or false presuppositions) when the agent lacks sufficient clarification capabilities. Existing evaluation frameworks struggle to capture such multi-turn pragmatic failures. To bridge this gap, the paper introduces the first diagnostic benchmark specifically designed to assess multi-turn clarification under input faults. Grounded in classical pragmatics theory, the framework establishes a taxonomy of cooperative breakdowns, incorporates a persona-driven user simulator, and employs state- and service-oriented interactive environments, together with the proposed Rise evaluation protocol. Experiments reveal substantial performance degradation among mainstream agents across fault types, with clarification efficacy modulated by both user persona and fault category, validating the framework's effectiveness in evaluating pragmatic robustness and safety-related failures.

📝 Abstract
As Large Language Models transition to autonomous agents, user inputs frequently violate cooperative assumptions (e.g., implicit intent, missing parameters, false presuppositions, or ambiguous expressions), creating execution risks that text-only evaluations do not capture. Existing benchmarks typically assume well-specified instructions or restrict evaluation to text-only, single-turn clarification, and thus do not measure multi-turn disambiguation under grounded execution risk. We introduce Drift-Bench, the first diagnostic benchmark that evaluates agentic pragmatics under input faults through multi-turn clarification across state-oriented and service-oriented execution environments. Grounded in classical theories of communication, Drift-Bench provides a unified taxonomy of cooperative breakdowns and employs a persona-driven user simulator with the Rise evaluation protocol. Experiments show substantial performance drops under these faults, with clarification effectiveness varying across user personas and fault types. Drift-Bench bridges clarification research and agent safety evaluation, enabling systematic diagnosis of failures that can lead to unsafe executions.
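The setting the abstract describes can be illustrated with a toy clarify-then-execute loop: a request with an injected fault (here, a missing parameter), a persona-driven user simulator that may or may not answer clarification questions, and an agent that should only execute once all required slots are resolved. This is a minimal sketch with invented names (`UserSimulator`, `run_episode`), not Drift-Bench's actual implementation or protocol.

```python
from dataclasses import dataclass

@dataclass
class UserSimulator:
    """Answers clarification questions about a faulty request."""
    hidden_slots: dict              # ground-truth values the request omitted
    persona: str = "cooperative"    # persona modulates how helpful answers are

    def reply(self, slot: str):
        if self.persona == "cooperative":
            return self.hidden_slots.get(slot)
        return None                 # a terse persona may withhold the answer

def run_episode(request: dict, required: list, user: UserSimulator,
                max_turns: int = 3) -> dict:
    """Agent loop: detect missing required slots, clarify, then 'execute'."""
    filled = dict(request)
    turns = 0
    for slot in required:
        if filled.get(slot) is None and turns < max_turns:
            filled[slot] = user.reply(slot)   # one clarification turn
            turns += 1
    # Executing with unresolved slots would be the unsafe outcome a
    # benchmark like this flags; here we just report whether it was safe.
    safe = all(filled.get(s) is not None for s in required)
    return {"executed": safe, "turns": turns, "state": filled}

# A request with a missing 'date' parameter, paired with a cooperative user.
user = UserSimulator(hidden_slots={"date": "2026-02-02"})
result = run_episode({"action": "book_flight", "date": None},
                     required=["date"], user=user)
```

With a "terse" persona instead, `reply` returns nothing, the slot stays unresolved, and the episode ends without a safe execution, mirroring how clarification effectiveness can vary by persona.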
Problem

Research questions and friction points this paper is trying to address.

cooperative breakdowns
input faults
multi-turn interaction
LLM agents
execution risk
Innovation

Methods, ideas, or system contributions that make the work stand out.

Drift-Bench
cooperative breakdowns
multi-turn clarification
agent safety
input faults
👥 Authors
Han Bao (University of Notre Dame)
Zheyuan Zhang (University of Notre Dame)
Pengcheng Jing (University of Notre Dame)
Zhengqing Yuan (PhD student, University of Notre Dame)
Kaiwen Shi (University of Notre Dame)
Yanfang Ye (University of Notre Dame)