Bayesian Influence Functions for Hessian-Free Data Attribution

📅 2025-09-30
📈 Citations: 0
Influential: 0
🤖 AI Summary
Classical influence functions fail in deep neural networks because the Hessian is ill-conditioned or non-invertible, a problem compounded by very high parameter dimensionality. To address this, the authors propose the local Bayesian influence function (BIF), an influence framework grounded in Bayesian inference that circumvents explicit Hessian inversion and instead leverages statistical properties of the loss landscape for data attribution. BIF uses stochastic-gradient Markov chain Monte Carlo (SG-MCMC) to efficiently approximate the posterior distribution over model parameters, capturing higher-order parameter interactions while scaling to billion-parameter models. Experiments show that BIF achieves state-of-the-art performance at predicting retraining outcomes, improving the accuracy, stability, and scalability of influence estimation. This work offers a new approach to quantifying data value in foundation models.

📝 Abstract
Classical influence functions face significant challenges when applied to deep neural networks, primarily due to non-invertible Hessians and high-dimensional parameter spaces. We propose the local Bayesian influence function (BIF), an extension of classical influence functions that replaces Hessian inversion with loss landscape statistics that can be estimated via stochastic-gradient MCMC sampling. This Hessian-free approach captures higher-order interactions among parameters and scales efficiently to neural networks with billions of parameters. We demonstrate state-of-the-art results on predicting retraining experiments.
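To make the Hessian-free idea concrete: under a Bayesian reading, the influence of a training example on a query can be estimated from the covariance of their losses over posterior samples, with no inverse-Hessian products. The sketch below is a hypothetical toy, not the paper's implementation: it draws approximate posterior samples with unadjusted Langevin dynamics (a simple SG-MCMC variant) on a one-parameter linear model and takes the pairwise loss covariance as the influence estimate. The model, hyperparameters, and sign convention are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 1-D linear regression, y = 2x + noise.
X = rng.normal(size=(32, 1))
y = 2.0 * X[:, 0] + 0.1 * rng.normal(size=32)

def per_example_losses(w):
    """Squared-error loss of each training example at parameters w."""
    return 0.5 * (X @ w - y) ** 2

def grad_total_loss(w):
    """Gradient of the mean squared-error loss."""
    return X.T @ (X @ w - y) / len(y)

def langevin_samples(w0, eps=1e-2, n_steps=2000, burn_in=1000):
    """Unadjusted Langevin dynamics: w <- w - (eps/2) grad + N(0, eps)."""
    w, samples = w0.copy(), []
    for t in range(n_steps):
        w = w - 0.5 * eps * grad_total_loss(w) \
            + np.sqrt(eps) * rng.normal(size=w.shape)
        if t >= burn_in:
            samples.append(w.copy())
    return np.array(samples)

ws = langevin_samples(np.zeros(1))
# Per-example losses at each posterior sample: shape (n_samples, n_train).
L = np.stack([per_example_losses(w) for w in ws])

# Influence estimate: covariance of per-example losses over the posterior.
# (Papers differ on the overall sign; the diagonal is each example's
# self-influence, a loss variance, so it is non-negative.)
bif = np.cov(L, rowvar=False)  # shape (n_train, n_train)
```

Examples whose losses co-vary strongly under posterior perturbations are mutually influential; no Hessian is ever formed or inverted, which is what lets this style of estimator scale with model size.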
Problem

Research questions and friction points this paper is trying to address.

Addresses non-invertible Hessians in deep neural networks
Proposes Hessian-free Bayesian influence functions for data attribution
Scales influence analysis to billion-parameter neural networks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Bayesian influence functions replace Hessian inversion
Uses loss landscape statistics via MCMC sampling
Scales efficiently to billion-parameter neural networks
Philipp Alexander Kreer
Technical University of Munich
Wilson Wu
University of Colorado Boulder
Maxwell Adam
University of Melbourne
Zach Furman
Research Fellow, Timaeus
Alignment, Interpretability, Science of deep learning
Jesse Hoogland
Executive Director, Timaeus
Singular learning theory, Developmental interpretability, AI safety, AI alignment