Distance between Relevant Information Pieces Causes Bias in Long-Context LLMs

📅 2024-10-18
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
🤖 AI Summary
This work identifies a novel positional bias in long-context large language models, termed "multi-relevant-span distance bias," in which model performance degrades significantly as the relative distance between multiple critical information spans increases. To study this phenomenon systematically, the authors introduce LongPiBench, the first benchmark supporting multi-span localization evaluation, and conduct comprehensive assessments across 11 state-of-the-art models. The study is the first to quantitatively characterize and empirically validate this distance-dependent bias, moving beyond conventional single-span analyses. Results show that although most models have mitigated the "lost in the middle" issue, they remain highly sensitive to inter-span distance, a bias observed consistently across both commercial and open-source models. The work provides a new analytical lens for long-context modeling and a reproducible, span-aware evaluation infrastructure to advance research on context-length scaling and positional generalization.

📝 Abstract
Positional bias in large language models (LLMs) hinders their ability to effectively process long inputs. A prominent example is the "lost in the middle" phenomenon, where LLMs struggle to utilize relevant information situated in the middle of the input. While prior research primarily focuses on single pieces of relevant information, real-world applications often involve multiple relevant information pieces. To bridge this gap, we present LongPiBench, a benchmark designed to assess positional bias involving multiple pieces of relevant information. Thorough experiments are conducted with five commercial and six open-source models. These experiments reveal that while most current models are robust against the "lost in the middle" issue, there exist significant biases related to the spacing of relevant information pieces. These findings highlight the importance of evaluating and reducing positional biases to advance LLMs' capabilities.
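As a rough illustration of the evaluation setup the abstract describes, the sketch below builds a long context with two relevant facts separated by a configurable number of filler sentences, so that model accuracy could be measured as a function of inter-span distance. This is not the paper's actual LongPiBench code; `build_context` and the filler text are invented for illustration, and the model call itself is omitted.

```python
# Illustrative sketch (assumed setup, not the official LongPiBench code):
# place two relevant facts a configurable distance apart inside filler text,
# so performance can be probed as inter-span distance grows.

FILLER = "The sky was clear and the streets were quiet that day. "

def build_context(fact_a: str, fact_b: str, distance: int,
                  total_fillers: int = 100) -> str:
    """Return a context where fact_a and fact_b are separated by
    `distance` filler sentences, roughly centered in the passage."""
    assert 0 <= distance <= total_fillers
    start = (total_fillers - distance) // 2
    parts = [FILLER] * total_fillers
    parts.insert(start, fact_a + " ")              # first relevant span
    parts.insert(start + 1 + distance, fact_b + " ")  # second span, `distance` fillers later
    return "".join(parts)

# Example: probe a model at increasing span distances (model call omitted).
ctx = build_context(
    "Fact: the access code for vault A is 4821.",
    "Fact: vault A is located in Geneva.",
    distance=40,
)
```

Sweeping `distance` while holding total context length fixed isolates the variable the paper studies: spacing between relevant pieces, rather than their absolute position.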
Problem

Research questions and friction points this paper is trying to address.

Positional bias hinders LLMs' long-input processing
Prior evaluations cover only single relevant information pieces
Spacing between multiple relevant pieces causes unstudied bias
Innovation

Methods, ideas, or system contributions that make the work stand out.

LongPiBench: benchmark for multiple relevant information pieces
Evaluates positional bias across 11 long-context LLMs
Reveals significant bias tied to relevant-piece spacing