A Bayesian Information-Theoretic Approach to Data Attribution

📅 2026-04-04
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the problem of efficiently identifying the training samples that drive a model's predictions, with the aim of improving interpretability and safety. It formulates data attribution as a Bayesian information-theoretic problem: a sample or subset is scored by the information loss its removal induces, that is, the increase in predictive entropy at a query. The criterion therefore credits examples for resolving predictive uncertainty rather than for fitting label noise. For efficiency, the approach approximates information loss with a Gaussian process surrogate built from tangent features; for larger-scale retrieval, it relaxes to an information-gain objective with a variance correction, which makes it compatible with vector database retrieval. Empirically, the method is competitive on counterfactual sensitivity, ground-truth attribution retrieval, and coreset selection, and it scales to modern deep architectures.
📝 Abstract
Training Data Attribution (TDA) seeks to trace model predictions back to influential training examples, enhancing interpretability and safety. We formulate TDA as a Bayesian information-theoretic problem: subsets are scored by the information loss they induce, that is, the increase in entropy at a query when the subset is removed. This criterion credits examples for resolving predictive uncertainty rather than label noise. To scale to modern networks, we approximate information loss using a Gaussian Process surrogate built from tangent features. We show that this approximation aligns with classical influence scores for single-example attribution while promoting diversity for subsets. For even larger-scale retrieval, we relax to an information-gain objective and add a variance correction, enabling scalable attribution in vector databases. Experiments show competitive performance on counterfactual sensitivity, ground-truth retrieval, and coreset selection, indicating that our method scales to modern architectures while bridging principled measures with practice.
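The information-loss criterion described above can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes a Gaussian process with a linear kernel over fixed feature vectors (standing in for tangent features), a Gaussian noise model, and scores the entropy of the latent prediction at the query. The function names and the `noise` and `prior` parameters are hypothetical. Under these assumptions the entropy of a Gaussian is 0.5 * log(2 * pi * e * variance), so the entropy increase from removing a subset reduces to half the log ratio of the predictive variances.

```python
import numpy as np


def predictive_variance(Phi, phi_q, noise=0.1, prior=1.0):
    """Posterior variance of the latent prediction at a query.

    Assumes a Bayesian linear model on features (a GP with linear kernel):
    weights ~ N(0, prior * I), observation noise variance `noise`.
    Phi has shape (n, d); phi_q has shape (d,).
    """
    d = Phi.shape[1]
    # Posterior precision of the weights: I / prior + Phi^T Phi / noise
    A = np.eye(d) / prior + Phi.T @ Phi / noise
    return float(phi_q @ np.linalg.solve(A, phi_q))


def information_loss(Phi, phi_q, subset, noise=0.1, prior=1.0):
    """Entropy increase (in nats) at the query when `subset` is removed."""
    keep = np.ones(Phi.shape[0], dtype=bool)
    keep[np.asarray(subset, dtype=int)] = False
    var_full = predictive_variance(Phi, phi_q, noise, prior)
    var_removed = predictive_variance(Phi[keep], phi_q, noise, prior)
    # Difference of Gaussian entropies: 0.5 * log(var_removed / var_full)
    return 0.5 * np.log(var_removed / var_full)


if __name__ == "__main__":
    # Toy features: samples 0 and 2 are duplicates aligned with the query,
    # sample 1 is orthogonal to it.
    Phi = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]])
    phi_q = np.array([1.0, 0.0])
    for subset in ([0], [1], [0, 2]):
        print(subset, information_loss(Phi, phi_q, subset))
```

In this Gaussian sketch the score depends only on the features and not on the labels, which is one way to read the claim that the criterion credits uncertainty reduction rather than label noise. The toy data also shows why subset scores are not additive: removing one of two duplicate samples loses little information because the other still covers the query, while removing both loses considerably more than the two single-sample scores combined.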
Problem

Research questions and friction points this paper is trying to address.

Training Data Attribution
Bayesian
Information Theory
Interpretability
Influence
Innovation

Methods, ideas, or system contributions that make the work stand out.

Bayesian Information Theory
Training Data Attribution
Information Loss
Gaussian Process Surrogate
Scalable Retrieval