AICO: Feature Significance Tests for Supervised Learning

📅 2025-06-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
Supervised learning models’ black-box nature severely limits their trustworthy deployment in high-stakes applications. This paper proposes a model- and distribution-agnostic framework for feature significance testing, applicable to both regression and classification tasks. The method quantifies each feature’s incremental contribution via mask-based perturbations—without retraining the original model or introducing auxiliary models—and constructs a uniformly most powerful randomized sign test on the median performance difference. The test delivers exact p-values and confidence intervals with exact coverage, combining statistical rigor with computational efficiency. Empirical evaluation on synthetic data confirms its accuracy and robustness; on real-world high-dimensional datasets, it yields reproducible and reliable feature importance assessments. By unifying statistical hypothesis testing with interpretability, the framework offers a principled route to trustworthy, explainable AI.

📝 Abstract
The opacity of many supervised learning algorithms remains a key challenge, hindering scientific discovery and limiting broader deployment -- particularly in high-stakes domains. This paper develops model- and distribution-agnostic significance tests to assess the influence of input features in any regression or classification algorithm. Our method evaluates a feature's incremental contribution to model performance by masking its values across samples. Under the null hypothesis, the distribution of performance differences across a test set has a non-positive median. We construct a uniformly most powerful, randomized sign test for this median, yielding exact p-values for assessing feature significance and confidence intervals with exact coverage for estimating population-level feature importance. The approach requires minimal assumptions, avoids model retraining or auxiliary models, and remains computationally efficient even for large-scale, high-dimensional settings. Experiments on synthetic tasks validate its statistical and computational advantages, and applications to real-world data illustrate its practical utility.
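The randomized sign test described in the abstract can be sketched with standard tools: given per-sample performance differences (masked loss minus original loss), count positive signs and compute a binomial tail, adding an external randomization term so the p-value is exactly uniform under the null. This is a minimal illustration of the generic randomized sign test for a median, not the paper's implementation; the function and argument names are ours.

```python
import math
import random

def binom_tail(n, k):
    """P(X >= k) for X ~ Binomial(n, 1/2)."""
    return sum(math.comb(n, i) for i in range(k, n + 1)) / 2 ** n

def randomized_sign_test(diffs, rng=None):
    """Randomized sign test of H0: median(diffs) <= 0 vs H1: median(diffs) > 0.

    Returns a randomized p-value P(X > k) + U * P(X = k), which is exactly
    Uniform(0, 1) under the null, so rejecting at p <= alpha has exact size.
    """
    rng = rng or random.Random(0)
    nonzero = [d for d in diffs if d != 0]   # ties at zero carry no sign information
    n = len(nonzero)
    k = sum(1 for d in nonzero if d > 0)     # samples where masking hurt performance
    p_gt = binom_tail(n, k + 1)              # strictly-greater binomial tail
    p_eq = math.comb(n, k) / 2 ** n          # probability mass at the observed count
    return p_gt + rng.random() * p_eq
```

A small p-value indicates the masked feature's removal degrades performance on more than half the test samples, i.e., the feature is significant.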
Problem

Research questions and friction points this paper is trying to address.

Tests feature influence in supervised learning models
Evaluates feature contribution via masking and performance difference
Provides exact p-values and confidence intervals efficiently
Innovation

Methods, ideas, or system contributions that make the work stand out.

Model-agnostic feature significance tests
Masking feature values for performance evaluation
Exact p-values and confidence intervals
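The mask-and-compare step named in the bullets above might look like the following sketch, assuming the fitted model exposes a prediction function and the loss is evaluated per sample; the choice of `mask_value` (e.g., zero or a training-set mean) and all names here are illustrative assumptions, not the paper's prescription.

```python
import numpy as np

def masked_loss_diffs(model_predict, X_test, y_test, feature_idx, mask_value, loss):
    """Per-sample performance differences for one feature.

    Computes loss(y, model(X with feature masked)) - loss(y, model(X)),
    element-wise over the test set. Positive values mean masking the
    feature degraded the prediction for that sample.
    """
    X_masked = X_test.copy()
    X_masked[:, feature_idx] = mask_value          # overwrite the feature everywhere
    base = loss(y_test, model_predict(X_test))     # per-sample loss, feature intact
    masked = loss(y_test, model_predict(X_masked)) # per-sample loss, feature masked
    return masked - base                           # one difference per test sample
```

These differences are exactly the inputs a sign test on the median needs, and no retraining or auxiliary model is involved: only two forward passes over the test set per feature.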
Kay Giesecke
Professor of Management Science and Engineering, Stanford University
Financial Technology · Machine Learning · Statistics · Monte Carlo Simulation · Stochastic Modeling
Enguerrand Horel
Upstart, Inc.
Chartsiri Jirachotkulthorn
Stanford University, Institute for Computational and Mathematical Engineering