DYCP: Dynamic Context Pruning for Long-Form Dialogue with LLMs

📅 2026-01-12
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work addresses the response-quality degradation and increased latency that large language models (LLMs) exhibit in long conversations as context inflates. Existing context management approaches are often inefficient or compromise dialogue coherence. To overcome these limitations, the authors propose DyCP, a lightweight dynamic context pruning method that, for the first time, segments and retrieves relevant memory at query time, conditioned on the current user input, without requiring predefined topic boundaries or additional LLM invocations. DyCP preserves temporal structure and conversational coherence while significantly improving response quality and reducing latency across multiple LLMs, as demonstrated on three long-context dialogue benchmarks: LoCoMo, MT-Bench+, and SCM4LLMs.

๐Ÿ“ Abstract
Large Language Models (LLMs) increasingly operate over long-form dialogues with frequent topic shifts. While recent LLMs support extended context windows, dialogue history must still be managed efficiently in practice because of inference cost and latency constraints. We present DyCP, a lightweight context management method implemented outside the LLM that dynamically identifies and retrieves relevant dialogue segments conditioned on the current turn, without offline memory construction. DyCP manages dialogue context while preserving the sequential nature of dialogue, without predefined topic boundaries, enabling adaptive and efficient context selection. Across three long-form dialogue benchmarks (LoCoMo, MT-Bench+, and SCM4LLMs) and multiple LLM backends, DyCP achieves competitive answer quality in downstream generation, with more selective context usage and improved inference efficiency.
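To make the idea concrete, here is a minimal sketch of query-time context pruning in the spirit described above: score each past turn for relevance to the current query, keep only the highest-scoring turns within a budget, and emit them in their original order so the temporal structure of the dialogue is preserved. The bag-of-words cosine scorer, the `prune_context` helper, and the `budget` parameter are all illustrative assumptions, not the paper's actual relevance model.

```python
from collections import Counter
import math


def relevance(query: str, turn: str) -> float:
    """Cosine similarity over bag-of-words counts.

    An illustrative stand-in for whatever relevance model DyCP
    actually uses (e.g. learned embeddings).
    """
    q = Counter(query.lower().split())
    t = Counter(turn.lower().split())
    dot = sum(q[w] * t[w] for w in q)
    norm = math.sqrt(sum(v * v for v in q.values())) * \
        math.sqrt(sum(v * v for v in t.values()))
    return dot / norm if norm else 0.0


def prune_context(history: list[str], query: str, budget: int = 3) -> list[str]:
    """Keep the `budget` most query-relevant turns, in original order."""
    # Rank turn indices by relevance to the current query.
    ranked = sorted(range(len(history)),
                    key=lambda i: relevance(query, history[i]),
                    reverse=True)
    # Re-sort the kept indices so temporal order is preserved.
    keep = sorted(ranked[:budget])
    return [history[i] for i in keep]


history = [
    "Let's plan the trip to Kyoto in April.",
    "My laptop keeps crashing when I open the IDE.",
    "We should book the Kyoto hotel near the station.",
    "Try reinstalling the IDE and updating drivers.",
]
pruned = prune_context(history, "Which Kyoto hotel did we choose?", budget=2)
print(pruned)
```

No topic boundaries are precomputed: the segmentation into "relevant" vs. "prunable" turns happens entirely at query time, which is what lets the selection adapt as the conversation shifts topics.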
Problem

Research questions and friction points this paper is trying to address.

context management
long-form dialogue
response latency
answer quality
large language models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic Context Pruning
Long-Form Dialogue
Context Management
LLMs
Adaptive Retrieval