Learning U-Statistics with Active Inference

📅 2026-05-12

🤖 AI Summary
This study addresses the efficient estimation of U-statistics when labels are costly and the labeling budget is fixed, while keeping statistical inference valid. It develops an active inference framework for U-statistics that combines selective querying of informative labels with machine learning predictions. The framework is built on an augmented inverse probability weighting U-statistic, for which the authors characterize the variance-minimizing sampling rule and design practical sampling strategies. They also extend the framework to U-statistic-based empirical risk minimization. On real datasets, the method shows substantial gains in estimation efficiency over baseline methods while maintaining target coverage.
📝 Abstract
$U$-statistics play a central role in statistical inference. In many modern applications, however, acquiring the labels required for $U$-statistics is costly. Motivated by recent advances in active inference, we develop an active inference framework for $U$-statistics that selectively queries informative labels to improve estimation efficiency under a fixed labeling budget, while preserving valid statistical inference. Our approach is built on the augmented inverse probability weighting $U$-statistic, which is designed to incorporate the sampling rule and machine learning predictions. We characterize the optimal sampling rule that minimizes its variance and design practical sampling strategies. We further extend the framework to $U$-statistic-based empirical risk minimization. Experiments on real datasets demonstrate substantial gains in estimation efficiency over baseline methods, while maintaining target coverage.
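To make the idea of an augmented inverse probability weighting $U$-statistic concrete, here is a minimal Python sketch for an order-2 kernel. It is an illustration under stated assumptions, not the paper's estimator: the abstract does not give the exact form, so the pairwise correction below (prediction-based kernel plus an inverse-probability-weighted residual on pairs where both labels were queried) is one natural construction that is unbiased when labels are queried independently with known probabilities. The names `variance_kernel`, `u_statistic`, and `aipw_u_statistic` are hypothetical, and the uniform sampling probabilities are a placeholder rather than the optimal rule the paper derives.

```python
import random
from itertools import combinations


def variance_kernel(a, b):
    """Order-2 kernel whose U-statistic is the unbiased sample variance."""
    return 0.5 * (a - b) ** 2


def u_statistic(values, kernel=variance_kernel):
    """Classical order-2 U-statistic: kernel averaged over all pairs."""
    pairs = list(combinations(range(len(values)), 2))
    return sum(kernel(values[i], values[j]) for i, j in pairs) / len(pairs)


def aipw_u_statistic(y, f, xi, pi, kernel=variance_kernel):
    """Assumed AIPW-style U-statistic (illustrative form, not from the paper).

    y  : labels; entries may be None where the label was not queried
    f  : machine learning predictions, available for every point
    xi : 0/1 indicators, xi[i] == 1 if label i was queried
    pi : known query probabilities, with xi[i] ~ Bernoulli(pi[i]) independently
    """
    total = 0.0
    n_pairs = 0
    for i, j in combinations(range(len(f)), 2):
        # Prediction-based kernel value, computable for every pair.
        term = kernel(f[i], f[j])
        if xi[i] and xi[j]:
            # Residual correction, reweighted by the probability that
            # both labels in the pair were queried.
            residual = kernel(y[i], y[j]) - kernel(f[i], f[j])
            term += residual / (pi[i] * pi[j])
        total += term
        n_pairs += 1
    return total / n_pairs


# Synthetic usage example (all data below is simulated for illustration).
rng = random.Random(0)
n = 200
y_full = [rng.gauss(0.0, 1.0) for _ in range(n)]
f = [v + rng.gauss(0.0, 0.3) for v in y_full]      # noisy predictions
pi = [0.5] * n                                     # placeholder uniform rule
xi = [1 if rng.random() < p else 0 for p in pi]    # which labels get queried
y = [v if s else None for v, s in zip(y_full, xi)]
estimate = aipw_u_statistic(y, f, xi, pi)
```

Two sanity checks follow from the construction: if every label is queried with probability 1, the estimator reduces to the classical $U$-statistic on the true labels, and if the predictions equal the labels, the correction term vanishes. A non-uniform `pi` that favors points where predictions are least reliable is where a variance-minimizing sampling rule would plug in.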
Problem

Research questions and friction points this paper is trying to address.

U-statistics
active inference
labeling budget
statistical inference
estimation efficiency
Innovation

Methods, ideas, or system contributions that make the work stand out.

active inference
U-statistics
inverse probability weighting
optimal sampling
empirical risk minimization