Mitigating Posterior Salience Attenuation in Long-Context LLMs with Positional Contrastive Decoding

📅 2025-06-10
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Posterior Salience Attenuation (PSA)—a phenomenon in long-context large language models where attention scores at critical positions systematically decay as context length grows, degrading performance—has lacked formal characterization and effective mitigation strategies. Method: This paper introduces Positional Contrastive Decoding (PCD), a fine-tuning-free decoding enhancement that amplifies outputs at salient positions by contrastively calibrating long-range and local attention responses in logit space. Contribution/Results: PCD achieves state-of-the-art performance on multiple long-context benchmarks (e.g., LongBench, NarrativeQA) with zero training overhead, alleviating attention attenuation and improving long-text understanding and generation quality. Because it operates entirely at inference time, without architectural or parameter modifications, PCD offers an efficient, interpretable, and scalable route to long-context optimization.

📝 Abstract
While Large Language Models (LLMs) support long contexts, they struggle with performance degradation within the context window. Current solutions incur prohibitive training costs, leaving statistical behaviors and cost-effective approaches underexplored. From the decoding perspective, we identify the Posterior Salience Attenuation (PSA) phenomenon, where the salience ratio correlates with long-text performance degradation. Notably, despite the attenuation, gold tokens still occupy high-ranking positions in the decoding space. Motivated by this observation, we propose the training-free Positional Contrastive Decoding (PCD), which contrasts the logits derived from long-aware attention with those from a designed local-aware attention, enabling the model to focus on the gains introduced by large-scale short-to-long training. Through an analysis of long-term decay simulation, we demonstrate that PCD effectively alleviates attention score degradation. Experimental results show that PCD achieves state-of-the-art performance on long-context benchmarks.
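The contrastive-calibration idea from the abstract can be sketched in a few lines. Note this is an illustrative assumption, not the paper's exact formulation: the page does not specify how PCD obtains or weights the two logit streams, so the function name `pcd_logits`, the toy vocabulary, and the `alpha` interpolation weight below are all hypothetical.

```python
import numpy as np

def pcd_logits(logits_long, logits_local, alpha=0.5):
    """Contrastively combine logits from a long-aware attention pass
    with logits from a local-aware (windowed) attention pass.

    A larger alpha further amplifies tokens whose salience comes from
    the long-range context rather than from the local window alone.
    """
    return (1.0 + alpha) * logits_long - alpha * logits_local

# Toy vocabulary of 5 tokens: the long-aware pass ranks token 2
# (the "gold" token) highest, while the local pass favors token 0.
logits_long = np.array([1.0, 0.5, 2.0, 0.2, 0.1])
logits_local = np.array([1.2, 0.6, 0.8, 0.3, 0.1])

combined = pcd_logits(logits_long, logits_local, alpha=0.5)
print(int(np.argmax(combined)))  # 2: the token favored by long-range context
```

The subtraction suppresses tokens that are equally probable under purely local context, so whatever advantage the gold token gains from long-range attention is what survives into decoding.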
Problem

Research questions and friction points this paper is trying to address.

Addresses performance degradation in long-context LLMs
Identifies Posterior Salience Attenuation (PSA) phenomenon
Proposes training-free Positional Contrastive Decoding (PCD) solution
Innovation

Methods, ideas, or system contributions that make the work stand out.

Positional Contrastive Decoding for long-context LLMs
Contrasts long-aware and local-aware attention logits
Training-free solution mitigates attention score degradation
Zikai Xiao
Zhejiang University
Ziyang Wang
University of Science and Technology of China
Wen Ma
University of Michigan, Amazon
Neural Networks · Machine Learning
Yan Zhang
Bytedance
Wei Shen
Zhejiang University
Yan Wang
Zhejiang University
Luqi Gong
Zhejiang Lab
Zuozhu Liu
Assistant Professor, Zhejiang University/University of Illinois Urbana-Champaign
deep learning · vision-language models · medical AI