🤖 AI Summary
Classical influence functions break down in deep neural networks because the Hessian is ill-conditioned or non-invertible, a problem compounded by very high parameter dimensionality. To address this, we propose the local Bayesian influence function (BIF), the first influence framework to integrate Bayesian inference: it circumvents explicit Hessian inversion and instead leverages statistical properties of the loss landscape for data attribution. BIF employs stochastic-gradient Markov chain Monte Carlo (SG-MCMC) to efficiently approximate the posterior distribution over model parameters, capturing higher-order parameter interactions while scaling naturally to models with billions of parameters. Experiments demonstrate that BIF achieves state-of-the-art performance in predicting retraining outcomes, significantly improving the accuracy, stability, and scalability of influence estimation. This work establishes a new paradigm for quantifying data value in foundation models.
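Concretely, one plausible formalization of the contrast described above (the notation here is illustrative, not taken from the paper): the classical influence of a training point on a query requires the inverse Hessian, while the Bayesian alternative replaces it with a covariance of per-example losses under a local posterior.

```latex
% Classical influence of training point z_m on query z_q at optimum \theta^*:
I(z_m, z_q) = -\nabla_\theta \ell(z_q, \theta^*)^\top \, H_{\theta^*}^{-1} \, \nabla_\theta \ell(z_m, \theta^*)
% Hessian-free Bayesian form: covariance of the two losses under a local
% posterior \pi around \theta^*, estimated from SG-MCMC samples:
\mathrm{BIF}(z_m, z_q) = \mathrm{Cov}_{\theta \sim \pi}\big(\ell(z_m, \theta),\; \ell(z_q, \theta)\big)
```

The second expression needs only samples from the loss landscape near the trained solution, which is why it avoids the ill-conditioning of $H^{-1}$.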
📝 Abstract
Classical influence functions face significant challenges when applied to deep neural networks, primarily due to non-invertible Hessians and high-dimensional parameter spaces. We propose the local Bayesian influence function (BIF), an extension of classical influence functions that replaces Hessian inversion with loss landscape statistics that can be estimated via stochastic-gradient MCMC sampling. This Hessian-free approach captures higher-order interactions among parameters and scales efficiently to neural networks with billions of parameters. We demonstrate state-of-the-art results in predicting the outcomes of retraining experiments.
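The SG-MCMC recipe the abstract describes can be sketched on a toy problem. This is a minimal illustration, not the paper's implementation: the model (1-D linear regression), the SGLD sampler, the temperature factor, and the loss-covariance influence score are all assumptions chosen to make the idea concrete.

```python
# Illustrative sketch (an assumption, not the paper's method): estimate a
# Bayesian-style influence score as the covariance between per-example losses
# across SGLD samples of the parameter, with no Hessian inversion anywhere.
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D linear regression data: y = w_true * x + noise.
w_true = 2.0
n_train = 20
x_train = rng.normal(size=n_train)
y_train = w_true * x_train + 0.1 * rng.normal(size=n_train)

# Held-out query point; make training point 0 an exact duplicate of it,
# so its loss should co-vary strongly with the query's loss.
x_query, y_query = 1.5, w_true * 1.5
x_train[0], y_train[0] = x_query, y_query

def per_example_loss(w, x, y):
    """Squared-error loss of parameter w on each example."""
    return 0.5 * (y - w * x) ** 2

def mean_grad(w):
    """Gradient of the mean training loss at w."""
    return np.mean(-(y_train - w * x_train) * x_train)

# SGLD around the trained solution: gradient step plus Gaussian noise.
# The small temperature factor (an assumption) keeps samples in a local
# neighborhood, mimicking a localized posterior.
w = 2.0           # start near the optimum
eps = 1e-3        # step size
beta = 0.05       # temperature-like scaling of the injected noise
n_samples = 4000
train_losses = np.empty((n_samples, n_train))
query_losses = np.empty(n_samples)
for t in range(n_samples):
    w += -eps * mean_grad(w) + beta * np.sqrt(2 * eps) * rng.normal()
    train_losses[t] = per_example_loss(w, x_train, y_train)
    query_losses[t] = per_example_loss(w, x_query, y_query)

# Influence of training point m on the query = covariance of their losses
# over the sampled parameters.
centered_q = query_losses - query_losses.mean()
centered_t = train_losses - train_losses.mean(axis=0)
influences = centered_t.T @ centered_q / n_samples
print(influences.shape)  # one score per training point
```

In this toy run the duplicate of the query (index 0) receives a positive influence score, since its loss moves in lockstep with the query's loss as the sampled parameter fluctuates; real models would replace the scalar `w` with the full parameter vector and use minibatch gradients in the SGLD step.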