Attributing Response to Context: A Jensen-Shannon Divergence Driven Mechanistic Study of Context Attribution in Retrieval-Augmented Generation

📅 2025-05-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
Context attribution in RAG, i.e., identifying the key context segments upon which a generated response depends, faces challenges including high computational cost, reliance on fine-tuning, and dependence on human annotations or proxy models. This paper proposes a zero-shot attribution method that requires neither fine-tuning nor a proxy model. We introduce the first attention-MLP co-analysis mechanism grounded in Jensen-Shannon divergence, revealing the critical roles that specific attention heads and MLP layers play in attribution. By quantifying the sensitivity of layer-wise neural activity to context inputs, our approach enables fine-grained, sentence-level importance scoring. Evaluated on multilingual and multi-hop benchmarks, including TyDi QA, Hotpot QA, and Musique, our method achieves significantly higher attribution accuracy than state-of-the-art proxy-model-based approaches while reducing inference overhead by an order of magnitude. The result is an interpretable, lightweight, plug-and-play attribution solution for trustworthy RAG.
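The divergence measure named in the title is the standard (symmetric, bounded) Jensen-Shannon divergence between two probability distributions, here the model's output distributions with and without a candidate context segment:

```latex
\mathrm{JSD}(P \,\|\, Q) = \tfrac{1}{2}\,\mathrm{KL}(P \,\|\, M) + \tfrac{1}{2}\,\mathrm{KL}(Q \,\|\, M),
\qquad M = \tfrac{1}{2}(P + Q),
```

where $\mathrm{KL}$ is the Kullback-Leibler divergence. Using natural logarithms, $\mathrm{JSD}$ is bounded by $\ln 2$, which makes per-sentence scores directly comparable.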

📝 Abstract
Retrieval-Augmented Generation (RAG) leverages large language models (LLMs) combined with external contexts to enhance the accuracy and reliability of generated responses. However, reliably attributing generated content to specific context segments (context attribution) remains challenging because current methods are computationally intensive, often requiring extensive fine-tuning or human annotation. In this work, we introduce a novel Jensen-Shannon Divergence driven method to Attribute Response to Context (ARC-JSD), enabling efficient and accurate identification of essential context sentences without additional fine-tuning or surrogate modelling. Evaluations on a wide range of RAG benchmarks, such as TyDi QA, Hotpot QA, and Musique, using instruction-tuned LLMs at different scales demonstrate superior accuracy and significant computational efficiency improvements over the previous surrogate-based method. Furthermore, our mechanistic analysis reveals specific attention heads and multilayer perceptron (MLP) layers responsible for context attribution, providing valuable insights into the internal workings of RAG models.
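The paper does not publish reference code here, but the scoring idea the abstract describes can be sketched as follows: compute the model's answer-token distribution once with the full context, then again with each context sentence ablated, and rank sentences by the Jensen-Shannon divergence between the two distributions. The function names (`jsd`, `attribution_scores`) are illustrative, not from the paper, and real usage would obtain the distributions from an LLM's softmax outputs.

```python
import numpy as np

def kl(p, q):
    """KL divergence in nats; terms with p_i == 0 contribute zero."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def jsd(p, q):
    """Jensen-Shannon divergence: symmetric, bounded by ln(2)."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    m = 0.5 * (p + q)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def attribution_scores(full_dist, ablated_dists):
    """Score each context sentence by how much removing it shifts the
    model's output distribution: higher JSD = more important sentence."""
    return [jsd(full_dist, d) for d in ablated_dists]

# Toy illustration with hypothetical two-token output distributions:
full = [0.7, 0.3]                      # distribution with full context
ablated = [[0.7, 0.3], [0.2, 0.8]]     # sentence 0 removed, sentence 1 removed
scores = attribution_scores(full, ablated)
# Removing sentence 1 shifts the distribution, so it scores higher.
```

In an actual pipeline, each entry of `ablated_dists` would come from re-running the model on the context with one sentence deleted, so the method needs only forward passes, consistent with the abstract's claim of no fine-tuning or surrogate model.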
Problem

Research questions and friction points this paper is trying to address.

Efficiently attributing generated content to specific context segments in RAG
Reducing computational intensity of current context attribution methods
Identifying internal model components responsible for context attribution
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses Jensen-Shannon Divergence for context attribution
Eliminates need for fine-tuning or surrogate modeling
Identifies key attention heads and MLP layers