🤖 AI Summary
Large language models (LLMs) suffer from performance degradation when processing ultra-long texts. Method: This paper proposes a noise-driven failure mechanism analysis framework, formally characterizing three critical noise sources: missing cross-chunk dependencies (task noise), internal model confusion induced by context expansion (model noise, which grows superlinearly with input length), and inaccurate integration of chunked outputs (aggregation noise). Based on this framework, we rigorously define the effective boundary of multi-agent divide-and-conquer processing and design cross-chunk dependency modeling and dynamic aggregation strategies. Contribution/Results: Experiments on long-text QA, retrieval, and summarization demonstrate that lightweight models, when equipped with optimized chunking and aggregation, significantly outperform GPT-4o's single-pass long-context inference. We establish, for the first time, a theoretically grounded noise taxonomy for long-context failure, providing a generalizable pathway for co-optimizing chunking and aggregation mechanisms.
📝 Abstract
We investigate the challenge of applying Large Language Models (LLMs) to long texts. We propose a theoretical framework that distinguishes the failure modes of long-context tasks into three categories: missing cross-chunk dependencies (task noise), confusion that grows with context size (model noise), and the imperfect integration of partial results (aggregator noise). Under this view, we analyze when multi-agent chunking is effective, i.e., dividing a lengthy sequence into smaller chunks and aggregating the processed results of each chunk. Our experiments on tasks such as retrieval, question answering, and summarization confirm both the theoretical analysis and the conditions that favor multi-agent chunking. By examining the superlinear growth of model noise with input length, we also explain why, for large inputs, a weaker model configured with chunk-based processing can surpass a more advanced model like GPT-4o applied in a single shot. Overall, we present a principled framework for understanding long-context failures, and our results highlight a direct pathway to handling long contexts in LLMs with carefully managed chunking and aggregation strategies.
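The divide-and-conquer pipeline the abstract describes can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the per-chunk step is a naive keyword filter standing in for a real LLM call, and the chunk size, overlap, and function names (`chunk_text`, `process_chunk`, `aggregate`) are all assumptions for the sake of the example. The overlap between chunks loosely mitigates task noise, and the deduplicating merge is a crude stand-in for the paper's dynamic aggregation.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 20) -> list[str]:
    """Split text into overlapping character chunks; the overlap reduces
    missing cross-chunk dependencies (task noise)."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]

def process_chunk(chunk: str, query: str) -> list[str]:
    """Placeholder for a per-chunk LLM call: here, naive keyword retrieval."""
    return [line for line in chunk.splitlines() if query.lower() in line.lower()]

def aggregate(partials: list[list[str]]) -> list[str]:
    """Merge per-chunk outputs, deduplicating the artifacts produced by
    overlapping chunks (a crude stand-in for dynamic aggregation)."""
    seen, merged = set(), []
    for part in partials:
        for item in part:
            if item not in seen:
                seen.add(item)
                merged.append(item)
    return merged

def chunked_answer(text: str, query: str, chunk_size: int = 200, overlap: int = 20) -> list[str]:
    """Full pipeline: chunk, process each chunk independently, aggregate."""
    chunks = chunk_text(text, chunk_size, overlap)
    return aggregate([process_chunk(c, query) for c in chunks])
```

For instance, `chunked_answer("alpha one\nbeta two\nalpha three", "alpha", chunk_size=20, overlap=10)` splits the text so that lines straddle chunk boundaries, yet the aggregation step still returns each matching line exactly once.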