Attributing Response to Context: A Jensen-Shannon Divergence Driven Mechanistic Study of Context Attribution in Retrieval-Augmented Generation

📅 2025-05-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
Context attribution in RAG, i.e., identifying the key context segments upon which a generated response depends, faces challenges including high computational cost, reliance on fine-tuning, and dependence on human annotations or proxy models. This paper proposes a zero-shot attribution method that requires neither fine-tuning nor a proxy model. We introduce the first attention-MLP co-analysis mechanism grounded in Jensen-Shannon divergence, revealing the critical roles that specific attention heads and MLP layers play in attribution. By quantifying the sensitivity of layer-wise neural activity to context inputs, our approach enables fine-grained, sentence-level importance scoring. Evaluated on multilingual and multi-hop benchmarks, including TyDi QA, Hotpot QA, and Musique, our method achieves significantly higher attribution accuracy than state-of-the-art proxy-model-based approaches while reducing inference overhead by an order of magnitude. The result is an interpretable, lightweight, plug-and-play attribution solution for trustworthy RAG.
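The divergence measure named in the title is the standard (symmetric, bounded) Jensen-Shannon divergence between two probability distributions, here the model's output distributions with and without a candidate context segment:

```latex
\mathrm{JSD}(P \,\|\, Q) = \tfrac{1}{2}\,\mathrm{KL}(P \,\|\, M) + \tfrac{1}{2}\,\mathrm{KL}(Q \,\|\, M),
\qquad M = \tfrac{1}{2}(P + Q),
```

where $\mathrm{KL}$ is the Kullback-Leibler divergence. Using natural logarithms, $\mathrm{JSD}$ is bounded by $\ln 2$, which makes per-sentence scores directly comparable.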

📝 Abstract
Retrieval-Augmented Generation (RAG) leverages large language models (LLMs) combined with external contexts to enhance the accuracy and reliability of generated responses. However, reliably attributing generated content to specific context segments (context attribution) remains challenging because current methods are computationally intensive, often requiring extensive fine-tuning or human annotation. In this work, we introduce a novel Jensen-Shannon Divergence driven method to Attribute Response to Context (ARC-JSD), enabling efficient and accurate identification of essential context sentences without additional fine-tuning or surrogate modelling. Evaluations on a wide range of RAG benchmarks, such as TyDi QA, Hotpot QA, and Musique, using instruction-tuned LLMs at different scales demonstrate superior accuracy and significant computational efficiency improvements over the previous surrogate-based method. Furthermore, our mechanistic analysis reveals specific attention heads and multilayer perceptron (MLP) layers responsible for context attribution, providing valuable insights into the internal workings of RAG models.
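The paper does not publish reference code here, but the scoring idea the abstract describes can be sketched as follows: compute the model's answer-token distribution once with the full context, then again with each context sentence ablated, and rank sentences by the Jensen-Shannon divergence between the two distributions. The function names (`jsd`, `attribution_scores`) are illustrative, not from the paper, and real usage would obtain the distributions from an LLM's softmax outputs.

```python
import numpy as np

def kl(p, q):
    """KL divergence in nats; terms with p_i == 0 contribute zero."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def jsd(p, q):
    """Jensen-Shannon divergence: symmetric, bounded by ln(2)."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    m = 0.5 * (p + q)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def attribution_scores(full_dist, ablated_dists):
    """Score each context sentence by how much removing it shifts the
    model's output distribution: higher JSD = more important sentence."""
    return [jsd(full_dist, d) for d in ablated_dists]

# Toy illustration with hypothetical two-token output distributions:
full = [0.7, 0.3]                      # distribution with full context
ablated = [[0.7, 0.3], [0.2, 0.8]]     # sentence 0 removed, sentence 1 removed
scores = attribution_scores(full, ablated)
# Removing sentence 1 shifts the distribution, so it scores higher.
```

In an actual pipeline, each entry of `ablated_dists` would come from re-running the model on the context with one sentence deleted, so the method needs only forward passes, consistent with the abstract's claim of no fine-tuning or surrogate model.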
Problem

Research questions and friction points this paper is trying to address.

Efficiently attributing generated content to specific context segments in RAG
Reducing computational intensity of current context attribution methods
Identifying internal model components responsible for context attribution
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses Jensen-Shannon Divergence for context attribution
Eliminates need for fine-tuning or surrogate modeling
Identifies key attention heads and MLP layers