🤖 AI Summary
Classical influence functions break down in deep neural networks because the Hessian is ill-conditioned or non-invertible, a problem compounded by very high parameter dimensionality. To address this, we propose the local Bayesian influence function (BIF), the first influence framework to integrate Bayesian inference: it circumvents explicit Hessian inversion and instead leverages statistical properties of the loss landscape for data attribution. BIF employs stochastic-gradient Markov chain Monte Carlo (SG-MCMC) to efficiently approximate the posterior distribution over model parameters, capturing higher-order parameter interactions while scaling naturally to models with billions of parameters. Experiments demonstrate that BIF achieves state-of-the-art performance in predicting retraining outcomes, significantly improving the accuracy, stability, and scalability of influence estimation. This work establishes a new paradigm for quantifying data value in foundation models.
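Concretely, one plausible formalization of the contrast described above (the notation here is illustrative, not taken from the paper): the classical influence of a training point on a query requires the inverse Hessian, while the Bayesian alternative replaces it with a covariance of per-example losses under a local posterior.

```latex
% Classical influence of training point z_m on query z_q at optimum \theta^*:
I(z_m, z_q) = -\nabla_\theta \ell(z_q, \theta^*)^\top \, H_{\theta^*}^{-1} \, \nabla_\theta \ell(z_m, \theta^*)
% Hessian-free Bayesian form: covariance of the two losses under a local
% posterior \pi around \theta^*, estimated from SG-MCMC samples:
\mathrm{BIF}(z_m, z_q) = \mathrm{Cov}_{\theta \sim \pi}\big(\ell(z_m, \theta),\; \ell(z_q, \theta)\big)
```

The second expression needs only samples from the loss landscape near the trained solution, which is why it avoids the ill-conditioning of $H^{-1}$.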
📝 Abstract
Classical influence functions face significant challenges when applied to deep neural networks, primarily due to non-invertible Hessians and high-dimensional parameter spaces. We propose the local Bayesian influence function (BIF), an extension of classical influence functions that replaces Hessian inversion with loss landscape statistics that can be estimated via stochastic-gradient MCMC sampling. This Hessian-free approach captures higher-order interactions among parameters and scales efficiently to neural networks with billions of parameters. We demonstrate state-of-the-art results in predicting the outcomes of retraining experiments.
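The SG-MCMC recipe the abstract describes can be sketched on a toy problem. This is a minimal illustration, not the paper's implementation: the model (1-D linear regression), the SGLD sampler, the temperature factor, and the loss-covariance influence score are all assumptions chosen to make the idea concrete.

```python
# Illustrative sketch (an assumption, not the paper's method): estimate a
# Bayesian-style influence score as the covariance between per-example losses
# across SGLD samples of the parameter, with no Hessian inversion anywhere.
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D linear regression data: y = w_true * x + noise.
w_true = 2.0
n_train = 20
x_train = rng.normal(size=n_train)
y_train = w_true * x_train + 0.1 * rng.normal(size=n_train)

# Held-out query point; make training point 0 an exact duplicate of it,
# so its loss should co-vary strongly with the query's loss.
x_query, y_query = 1.5, w_true * 1.5
x_train[0], y_train[0] = x_query, y_query

def per_example_loss(w, x, y):
    """Squared-error loss of parameter w on each example."""
    return 0.5 * (y - w * x) ** 2

def mean_grad(w):
    """Gradient of the mean training loss at w."""
    return np.mean(-(y_train - w * x_train) * x_train)

# SGLD around the trained solution: gradient step plus Gaussian noise.
# The small temperature factor (an assumption) keeps samples in a local
# neighborhood, mimicking a localized posterior.
w = 2.0           # start near the optimum
eps = 1e-3        # step size
beta = 0.05       # temperature-like scaling of the injected noise
n_samples = 4000
train_losses = np.empty((n_samples, n_train))
query_losses = np.empty(n_samples)
for t in range(n_samples):
    w += -eps * mean_grad(w) + beta * np.sqrt(2 * eps) * rng.normal()
    train_losses[t] = per_example_loss(w, x_train, y_train)
    query_losses[t] = per_example_loss(w, x_query, y_query)

# Influence of training point m on the query = covariance of their losses
# over the sampled parameters.
centered_q = query_losses - query_losses.mean()
centered_t = train_losses - train_losses.mean(axis=0)
influences = centered_t.T @ centered_q / n_samples
print(influences.shape)  # one score per training point
```

In this toy run the duplicate of the query (index 0) receives a positive influence score, since its loss moves in lockstep with the query's loss as the sampled parameter fluctuates; real models would replace the scalar `w` with the full parameter vector and use minibatch gradients in the SGLD step.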