Mitigating Conversational Inertia in Multi-Turn Agents

📅 2026-02-03
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work addresses the issue of "conversational inertia" in multi-turn agent interactions, where large language models tend to repetitively mimic their own prior responses, thereby limiting exploratory behavior. The study is the first to reveal a connection between this phenomenon and context length, and proposes a novel method for constructing preference pairs without requiring environmental rewards. By analyzing attention patterns to detect inertia and leveraging differences in context length to generate implicit preference signals, the approach combines contextual preference learning with a dynamic context management strategy at inference time, effectively balancing exploration and exploitation. Evaluated across eight agent environments and one deep research scenario, the method significantly reduces conversational inertia and consistently improves task performance.
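The summary describes detecting inertia through attention analysis: tokens of the current response attend disproportionately to the model's own previous responses. A minimal sketch of such a score is below; the function name, the span-based interface, and the use of a head/layer-averaged attention matrix are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def inertia_score(attn, prev_response_spans, cur_response_span):
    """Hypothetical inertia score: the fraction of attention mass that
    tokens in the current response place on tokens of the model's own
    previous responses.

    attn: (seq_len, seq_len) attention matrix, averaged over heads/layers.
    prev_response_spans: list of (start, end) index pairs for prior responses.
    cur_response_span: (start, end) index pair for the current response.
    """
    cs, ce = cur_response_span
    # Attention mass flowing from current-response tokens to prior responses.
    mass = sum(attn[cs:ce, ps:pe].sum() for ps, pe in prev_response_spans)
    # Normalize by the total attention emitted by current-response tokens,
    # so the score lies in [0, 1]; higher means stronger inertia.
    total = attn[cs:ce, :].sum()
    return float(mass / total) if total > 0 else 0.0
```

With a uniform attention matrix the score simply equals the fraction of context positions occupied by prior responses, which makes the normalization easy to sanity-check.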

๐Ÿ“ Abstract
Large language models excel as few-shot learners when provided with appropriate demonstrations, yet this strength becomes problematic in multi-turn agent scenarios, where LLMs erroneously mimic their own previous responses as few-shot examples. Through attention analysis, we identify conversational inertia, a phenomenon where models exhibit strong diagonal attention to previous responses, which is associated with imitation bias that constrains exploration. This reveals a tension when transforming few-shot LLMs into agents: longer context enriches environmental feedback for exploitation, yet also amplifies conversational inertia that undermines exploration. Our key insight is that for identical states, actions generated with longer contexts exhibit stronger inertia than those with shorter contexts, enabling construction of preference pairs without environment rewards. Based on this, we propose Context Preference Learning to calibrate model preferences to favor low-inertia responses over high-inertia ones. We further provide context management strategies at inference time to balance exploration and exploitation. Experimental results across eight agentic environments and one deep research scenario validate that our framework reduces conversational inertia and achieves performance improvements.
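The abstract's key insight is that, for the same state, the long-context action is the dispreferred member of a preference pair and the short-context action the preferred one, so preference learning needs no environment reward. A minimal sketch of one plausible training objective is below, using a DPO-style loss over sequence log-probabilities; the function name, argument names, and the choice of a DPO-style formulation are illustrative assumptions, not the paper's stated algorithm.

```python
import math

def context_preference_loss(logp_short_policy, logp_long_policy,
                            logp_short_ref, logp_long_ref, beta=0.1):
    """Hypothetical DPO-style loss for Context Preference Learning.

    For an identical state, the action generated under a SHORTER context
    (lower inertia) is treated as the preferred response and the
    longer-context action as the dispreferred one. Arguments are sequence
    log-probabilities under the policy being trained and a frozen
    reference model; beta scales the implicit reward margin.
    """
    margin = beta * ((logp_short_policy - logp_short_ref)
                     - (logp_long_policy - logp_long_ref))
    # Negative log-sigmoid of the margin: minimized when the policy
    # assigns relatively more probability to the low-inertia action.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

The loss shrinks as the policy shifts probability toward the low-inertia (short-context) action relative to the reference model, which matches the stated goal of calibrating preferences toward low-inertia responses.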
Problem

Research questions and friction points this paper is trying to address.

conversational inertia
multi-turn agents
imitation bias
large language models
exploration-exploitation tradeoff
Innovation

Methods, ideas, or system contributions that make the work stand out.

Conversational Inertia
Context Preference Learning
Exploration-Exploitation Tradeoff
Attention Analysis
Multi-turn Agents
Authors

Yang Wan
College of Computer Science and Technology, Zhejiang University, Hangzhou, China
Zheng Cao
Nanjing University of Information Science and Technology
Zhenhao Zhang
University of Rochester, Rochester, NY, USA
Zhengwen Zeng
Ant Group, Hangzhou, China
Shuheng Shen
Ant Group
Changhua Meng
Ant Group, Hangzhou, China
Linchao Zhu
College of Computer Science and Technology, Zhejiang University, Hangzhou, China