Extending Kernel Trick to Influence Functions

📅 2026-05-11

🤖 AI Summary
This work addresses the cost of computing influence functions when the number of model parameters vastly exceeds the dataset size. The authors propose a dual representation of influence functions that brings the kernel trick into influence analysis. Assuming linearizable models, they construct a dual formulation whose computational complexity scales with dataset size rather than with the number of parameters. The resulting method estimates the effect of removing individual data points on model parameters, predictions, and loss, and is most advantageous in large-model, small-data regimes where parameter-space approaches are prohibitively expensive.
📝 Abstract
In this paper, we present a dual representation of the influence functions, whose computational complexity scales with dataset size rather than model size. Both analytically and experimentally, we show that this representation can be an efficient alternative to the original influence functions for estimating changes in parameters, model outputs and loss due to data point removal, when model size is large relative to dataset size, or when evaluating the original influence functions in parameter space is infeasible. The dual representation, however, is limited to linearizable models, which are models whose behavior can be approximated by their linearizations throughout training, and requires materializing a matrix, whose size grows with the product of model output dimension and dataset size.
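To make the primal-versus-dual idea concrete, here is a minimal sketch for the simplest linearizable case: a scalar-output model that is linear in its features, trained with squared loss and ridge regularization. This is an illustration under those assumptions, not the paper's formulation. The function names (`primal_influence`, `dual_influence`), the feature matrix `Phi`, and the regularizer `lam` are all hypothetical. The sketch relies on the standard push-through identity $(\Phi^\top \Phi + \lambda I)^{-1} \Phi^\top = \Phi^\top (\Phi \Phi^\top + \lambda I)^{-1}$, which lets the $p \times p$ Hessian solve be replaced by an $n \times n$ kernel solve.

```python
import numpy as np

# Assumed setting (not taken from the paper): f(x) = phi(x) @ theta, with objective
#   0.5 * sum_j (phi_j @ theta - y_j)**2 + 0.5 * lam * ||theta||**2.
# Both functions return the first-order influence estimate of the parameter
# change caused by removing training point i.

def primal_influence(Phi, y, lam, i):
    """Parameter-space estimate: solves a p x p system (p = number of parameters)."""
    n, p = Phi.shape
    H = Phi.T @ Phi + lam * np.eye(p)          # Hessian of the objective
    theta = np.linalg.solve(H, Phi.T @ y)      # fitted parameters
    r_i = Phi[i] @ theta - y[i]                # residual at point i
    return np.linalg.solve(H, Phi[i] * r_i)    # H^{-1} * gradient of point i's loss

def dual_influence(Phi, y, lam, i):
    """Dual estimate: solves an n x n system (n = number of data points)."""
    n = Phi.shape[0]
    K = Phi @ Phi.T                            # kernel (Gram) matrix
    G = K + lam * np.eye(n)
    alpha = np.linalg.solve(G, y)              # dual coefficients, theta = Phi.T @ alpha
    r_i = K[i] @ alpha - y[i]                  # same residual, from kernel values only
    e_i = np.zeros(n)
    e_i[i] = 1.0
    beta = np.linalg.solve(G, e_i) * r_i       # dual-space influence coefficients
    return Phi.T @ beta                        # map back to parameter space

# Small demo in the large-model, small-data regime (n << p).
rng = np.random.default_rng(0)
Phi_demo = rng.normal(size=(5, 50))
y_demo = rng.normal(size=5)
d_primal = primal_influence(Phi_demo, y_demo, lam=0.1, i=2)
d_dual = dual_influence(Phi_demo, y_demo, lam=0.1, i=2)
print("max abs difference:", np.max(np.abs(d_primal - d_dual)))
```

In this toy setting the two estimates coincide up to floating-point error, but the dual version only ever inverts an $n \times n$ matrix. The change in a test prediction can likewise be obtained from kernel values alone, as `phi(x) @ Phi.T @ beta`, without forming the parameter-space vector. For models with multi-dimensional outputs, the abstract notes that the materialized matrix grows with the product of output dimension and dataset size, which this scalar-output sketch does not show.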
Problem

Research questions and friction points this paper is trying to address.

influence functions
kernel trick
computational complexity
linearizable models
dual representation
Innovation

Methods, ideas, or system contributions that make the work stand out.

influence functions
dual representation
kernel trick
linearizable models
computational complexity