🤖 AI Summary
This work addresses the cost of computing influence functions when model size vastly exceeds dataset size. The authors propose a dual representation of influence functions that brings kernel methods into influence analysis. For linearizable models, they construct the dual formulation explicitly, so that computational complexity scales with dataset size rather than with the number of model parameters. The resulting method estimates the effect of removing individual data points on model parameters, predictions, and loss, and is most advantageous in large-model, small-data regimes where evaluating influence functions in parameter space is prohibitively expensive or infeasible.
📝 Abstract
In this paper, we present a dual representation of the influence functions, whose computational complexity scales with dataset size rather than model size. Both analytically and experimentally, we show that this representation can be an efficient alternative to the original influence functions for estimating changes in parameters, model outputs, and loss due to data point removal when model size is large relative to dataset size, or when evaluating the original influence functions in parameter space is infeasible. The dual representation, however, is limited to linearizable models, that is, models whose behavior can be approximated by their linearizations throughout training, and it requires materializing a matrix whose size grows with the product of model output dimension and dataset size.
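To make the primal/dual trade-off concrete, here is a minimal sketch for the simplest linear case, L2-regularized least squares with scalar outputs. It is an illustration under stated assumptions, not the authors' construction: the function names, the regularizer `lam`, and the synthetic data are all hypothetical. The primal routine solves a p × p system in parameter space, while the dual routine solves only an n × n system built from the Gram matrix, relying on the identity (XᵀX + λI)⁻¹Xᵀ = Xᵀ(XXᵀ + λI)⁻¹.

```python
import numpy as np

def primal_influence(X, y, lam, i):
    """First-order estimate of theta_{-i} - theta in parameter space.

    Assumed objective: 0.5 * sum_j (x_j @ theta - y_j)**2 + 0.5 * lam * ||theta||**2.
    Cost is dominated by a p x p linear solve.
    """
    n, p = X.shape
    H = X.T @ X + lam * np.eye(p)        # Hessian of the full objective (p x p)
    theta = np.linalg.solve(H, X.T @ y)  # ridge solution
    r_i = X[i] @ theta - y[i]            # residual of the removed point
    grad_i = X[i] * r_i                  # gradient of that point's loss
    return np.linalg.solve(H, grad_i)    # H^{-1} grad_i

def dual_influence(X, y, lam, i):
    """Same estimate computed through the n x n Gram matrix.

    Uses (X^T X + lam I)^{-1} X^T = X^T (X X^T + lam I)^{-1}.
    Cost is dominated by an n x n linear solve.
    """
    n = X.shape[0]
    K = X @ X.T                          # Gram matrix (n x n)
    G = K + lam * np.eye(n)
    alpha = np.linalg.solve(G, y)        # dual coefficients, theta = X.T @ alpha
    r_i = K[i] @ alpha - y[i]            # same residual, without forming theta
    e_i = np.zeros(n)
    e_i[i] = 1.0
    beta = np.linalg.solve(G, e_i * r_i)
    return X.T @ beta                    # map back to parameter space

# Synthetic large-model, small-data setting (illustrative values only).
rng = np.random.default_rng(0)
n, p, lam = 20, 500, 1.0
X = rng.normal(size=(n, p))
y = rng.normal(size=n)

delta_primal = primal_influence(X, y, lam, i=3)
delta_dual = dual_influence(X, y, lam, i=3)
print("max abs difference:", np.max(np.abs(delta_primal - delta_dual)))
```

With scalar outputs the materialized matrix is n × n. With a d-dimensional output, the corresponding matrix would have side length n·d, which matches the limitation stated in the abstract. For a nonlinear but linearizable model, the rows of `X` would be replaced by per-example Jacobians of the model outputs with respect to the parameters; that extension is not shown here.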