🤖 AI Summary
This work addresses the limitations of existing active data acquisition methods, which rely on predictive uncertainty estimates that are often unreliable in deep learning and optimization tasks. The authors propose GOIMDA, a novel algorithm that, for the first time, integrates goal-oriented influence functions with inverse curvature to enable uncertainty-aware sample selection without explicit Bayesian posterior inference. Built on first-order influence functions, GOIMDA combines the goal gradient, training-loss curvature, and the sensitivity of model parameters to candidate samples into a computationally tractable acquisition criterion. Theoretically, this criterion is shown to approximate predictive-entropy minimization. Empirically, GOIMDA significantly outperforms current active learning and Bayesian optimization baselines on image and text classification as well as hyperparameter tuning tasks, reaching target performance with fewer labeled examples or function evaluations.
📝 Abstract
Active data acquisition is central to many learning and optimization tasks in deep neural networks, yet it remains challenging because most approaches rely on predictive uncertainty estimates that are difficult to obtain reliably. To address this, we propose Goal-Oriented Influence-Maximizing Data Acquisition (GOIMDA), an active acquisition algorithm that avoids explicit posterior inference while remaining uncertainty-aware through inverse curvature. GOIMDA selects inputs by maximizing their expected influence on a user-specified goal functional, such as test loss, predictive entropy, or the value of an optimizer-recommended design. Leveraging first-order influence functions, we derive a tractable acquisition rule that combines the goal gradient, training-loss curvature, and candidate sensitivity to model parameters. We show theoretically that, for generalized linear models, GOIMDA approximates predictive-entropy minimization up to a correction term accounting for goal alignment and prediction bias, thereby yielding uncertainty-aware behavior without maintaining a Bayesian posterior. Empirically, across learning tasks (including image and text classification) and optimization tasks (including noisy global optimization benchmarks and neural-network hyperparameter tuning), GOIMDA consistently reaches target performance with substantially fewer labeled samples or function evaluations than uncertainty-based active learning and Gaussian-process Bayesian optimization baselines.
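The acquisition rule described above (goal gradient, combined with training-loss curvature and candidate sensitivity via first-order influence functions) can be sketched for the simplest possible case. The snippet below is an illustrative toy, not the paper's implementation: it assumes a linear model with squared loss, where the training-loss Hessian is `H = Xᵀ X / n`, a candidate's parameter sensitivity (under a hypothetical unit residual) is just its feature vector, and the score is `|goal_gradᵀ H⁻¹ x|`. The function name `goimda_scores`, the damping term, and the unit-residual assumption are all ours.

```python
import numpy as np

def goimda_scores(X_train, X_cand, goal_grad, damping=1e-3):
    """Illustrative influence-based acquisition scores (hypothetical sketch).

    score(x) = | goal_grad^T H^{-1} x |, where H is the (damped) Hessian of
    the training loss. For linear regression with squared loss, H = X^T X / n,
    and x stands in for the candidate's gradient under a unit residual.
    """
    n, d = X_train.shape
    # Damped curvature of the training loss; damping keeps H invertible.
    H = X_train.T @ X_train / n + damping * np.eye(d)
    # Inverse-curvature weighting of the goal gradient: v = H^{-1} g.
    v = np.linalg.solve(H, goal_grad)
    # One non-negative score per candidate: |x^T v|.
    return np.abs(X_cand @ v)

# Toy usage: acquire the candidate with the largest expected goal influence.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(50, 3))
X_cand = rng.normal(size=(10, 3))
goal_grad = rng.normal(size=3)
scores = goimda_scores(X_train, X_cand, goal_grad)
best = int(np.argmax(scores))
```

Directions in which the training data provide little curvature get amplified by `H⁻¹`, which is the sense in which the criterion is uncertainty-aware without a Bayesian posterior.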