🤖 AI Summary
This work addresses the degradation in response quality and the increased latency that large language models (LLMs) experience in long conversations due to context inflation. Existing context management approaches often suffer from inefficiency or compromise dialogue coherence. To overcome these limitations, we propose DyCP, a lightweight dynamic context pruning method that, for the first time, dynamically segments and retrieves relevant memory at query time based on the current user input, without requiring predefined topic boundaries or additional LLM invocations. DyCP effectively preserves temporal structure and conversational coherence while significantly improving response quality and reducing latency across multiple LLMs, as demonstrated on three long-context dialogue benchmarks: LoCoMo, MT-Bench+, and SCM4LLMs.
📝 Abstract
Large Language Models (LLMs) increasingly operate over long-form dialogues with frequent topic shifts. While recent LLMs support extended context windows, dialogue history must still be managed efficiently in practice due to inference cost and latency constraints. We present DyCP, a lightweight context management method implemented outside the LLM that dynamically identifies and retrieves relevant dialogue segments conditioned on the current turn, without offline memory construction. DyCP manages dialogue context while preserving the sequential nature of dialogue, without predefined topic boundaries, enabling adaptive and efficient context selection. Across three long-form dialogue benchmarks (LoCoMo, MT-Bench+, and SCM4LLMs) and multiple LLM backends, DyCP achieves competitive answer quality in downstream generation, with more selective context usage and improved inference efficiency.
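To make the idea of query-time context selection concrete, here is a minimal sketch of retrieving relevant dialogue turns conditioned on the current query while preserving their original order. This is an illustrative toy, not DyCP's actual algorithm: the bag-of-words cosine scoring, the `prune_context` name, and the fixed turn budget are all assumptions introduced for the example (the paper's segmentation and relevance scoring are not specified here).

```python
from collections import Counter
import math


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two bag-of-words term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def prune_context(history: list[str], query: str, budget: int) -> list[str]:
    """Toy query-time pruning: score each past turn against the current
    query, keep the `budget` most relevant turns, and re-emit them in
    their original order so the dialogue's sequential structure survives."""
    q_vec = Counter(query.lower().split())
    scored = [(i, cosine(Counter(turn.lower().split()), q_vec))
              for i, turn in enumerate(history)]
    # Pick the top-`budget` turns by score, then restore chronological order.
    keep = sorted(sorted(scored, key=lambda s: s[1], reverse=True)[:budget])
    return [history[i] for i, _ in keep]


history = [
    "we talked about python decorators",
    "my cat likes tuna",
    "decorators wrap functions",
    "the weather is nice today",
]
pruned = prune_context(history, "how do decorators work", budget=2)
# Keeps the two decorator-related turns, in their original order.
```

In a real system the scorer would be a learned or embedding-based relevance function, but the key property illustrated, selecting a query-dependent subset at inference time while keeping temporal order, is the same.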