First-Order Efficiency for Probabilistic Value Estimation via A Statistical Viewpoint

📅 2026-05-04

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the cost of estimating probabilistic values such as Shapley values, whose exact computation requires utility evaluations over exponentially many coalitions. Analyzing existing Monte Carlo estimators from a unified statistical perspective, the study shows that they share a common first-order error structure whose leading term is an augmented inverse-probability weighted influence term, and it derives an explicit expression for the leading mean squared error (MSE). Building on this, the authors propose the Efficiency-Aware Surrogate-adjusted Estimator (EASE), which chooses the sampling law and the surrogate function jointly to minimize the first-order MSE. According to the authors, EASE consistently outperforms state-of-the-art estimators for various probabilistic values.
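To make the cost argument concrete, the following minimal sketch compares exact Shapley computation, which enumerates every coalition, with a plain permutation-sampling Monte Carlo estimate. The toy `utility` function, the four-player game, and all function names are assumptions made for illustration; none of them come from the paper, and the sampling scheme shown is the standard permutation baseline, not EASE.

```python
import itertools
import math
import random


def utility(coalition):
    """Hypothetical toy utility: additive weights plus one pairwise interaction."""
    weights = [1.0, 2.0, 3.0, 4.0]
    value = sum(weights[j] for j in coalition)
    if 0 in coalition and 1 in coalition:
        value += 1.5  # interaction between players 0 and 1
    return value


def exact_shapley(n, player):
    """Exact Shapley value: visits all 2**(n-1) coalitions that exclude `player`."""
    others = [j for j in range(n) if j != player]
    total = 0.0
    for size in range(n):
        # Shapley weight of any coalition of this size.
        weight = (math.factorial(size) * math.factorial(n - size - 1)
                  / math.factorial(n))
        for subset in itertools.combinations(others, size):
            s = frozenset(subset)
            total += weight * (utility(s | {player}) - utility(s))
    return total


def permutation_shapley(n, player, num_samples, rng):
    """Monte Carlo estimate: average marginal contribution over random orderings."""
    players = list(range(n))
    total = 0.0
    for _ in range(num_samples):
        rng.shuffle(players)
        # Coalition of players that precede `player` in the random ordering.
        s = frozenset(players[:players.index(player)])
        total += utility(s | {player}) - utility(s)
    return total / num_samples


if __name__ == "__main__":
    print("exact:      ", exact_shapley(4, 0))
    print("monte carlo:", permutation_shapley(4, 0, 20_000, random.Random(0)))
```

With four players the exact loop is cheap, but its coalition count doubles with every added player, whereas the sampling estimator's cost is set by `num_samples`.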

📝 Abstract

Probabilistic values, including Shapley values and semivalues, provide a model-agnostic framework to attribute the behavior of a black-box model to data points or features, with a wide range of applications including explainable artificial intelligence and data valuation. However, their exact computation requires utility evaluations over exponentially many coalitions, making Monte Carlo approximation essential in modern machine learning applications. Existing estimators are often developed through different identification strategies, including weighted averages, self-normalized weighting, regression adjustment, and weighted least squares. Our key observation is that these seemingly distinct constructions share a common first-order error structure, in which the leading term is an augmented inverse-probability weighted influence term determined by the sampling law and a working surrogate function. This first-order representation yields an explicit expression for the leading mean squared error (MSE), which characterizes how the sampling law and the surrogate jointly determine statistical efficiency. Guided by this criterion, we propose an Efficiency-Aware Surrogate-adjusted Estimator (EASE) that directly chooses the sampling law and surrogate to minimize the first-order MSE. We demonstrate that EASE consistently outperforms state-of-the-art estimators for various probabilistic values.
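As a rough illustration of how a sampling law and a working surrogate can enter one estimator, the sketch below applies a generic augmented inverse-probability weighted (AIPW-style) correction to a toy Shapley problem: the surrogate's known mean is added to an importance-weighted average of the residuals. It is not the paper's EASE procedure and does not optimize anything. The toy `utility`, the deliberately imperfect `surrogate`, and the Bernoulli(0.5) sampling law are all assumptions chosen for illustration.

```python
import math
import random

N = 4          # number of players in the hypothetical toy game
PLAYER = 0     # player whose Shapley value is estimated
OTHERS = [j for j in range(N) if j != PLAYER]


def utility(coalition):
    """Hypothetical toy utility: additive weights plus one pairwise interaction."""
    weights = [1.0, 2.0, 3.0, 4.0]
    value = sum(weights[j] for j in coalition)
    if 0 in coalition and 1 in coalition:
        value += 1.5
    return value


def marginal(s):
    """Marginal contribution of PLAYER to coalition s."""
    return utility(s | {PLAYER}) - utility(s)


def shapley_prob(s):
    """Target weight of coalition s under the Shapley distribution."""
    k = len(s)
    return math.factorial(k) * math.factorial(N - k - 1) / math.factorial(N)


def surrogate(s):
    """Assumed working surrogate: a deliberately imperfect guess of marginal(s)."""
    return 1.0 + 1.0 * (1 in s)


# Under the Shapley distribution any other player belongs to s with
# probability 1/2, so the mean of this linear surrogate is known exactly.
SURROGATE_MEAN = 1.0 + 1.0 * 0.5


def sample_coalition(rng):
    """Assumed sampling law: include each other player independently w.p. 0.5."""
    s = frozenset(j for j in OTHERS if rng.random() < 0.5)
    return s, 0.5 ** len(OTHERS)


def adjusted_estimate(num_samples, rng, use_surrogate=True):
    """Surrogate mean plus importance-weighted average of residuals."""
    total = 0.0
    for _ in range(num_samples):
        s, q = sample_coalition(rng)
        w = shapley_prob(s) / q                      # importance weight
        g = surrogate(s) if use_surrogate else 0.0
        total += w * (marginal(s) - g)
    base = SURROGATE_MEAN if use_surrogate else 0.0
    return base + total / num_samples


if __name__ == "__main__":
    print("with surrogate:   ", adjusted_estimate(20_000, random.Random(0)))
    print("without surrogate:", adjusted_estimate(20_000, random.Random(0), False))
```

Both variants are unbiased for the same target whenever the sampling law gives positive probability to every coalition. In this toy setup the surrogate-adjusted version has lower variance because the residual `marginal(s) - surrogate(s)` is much smaller than `marginal(s)` itself, which is the sense in which the sampling law and the surrogate jointly determine efficiency.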

Problem

Research questions and friction points this paper is trying to address.

probabilistic values

Monte Carlo estimation

statistical efficiency

Shapley values

semivalues

Innovation

Methods, ideas, or system contributions that make the work stand out.

first-order efficiency

probabilistic value estimation

Shapley values