Prediction-Powered Inference with Inverse Probability Weighting

📅 2025-08-13
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This paper addresses informative missingness in partially labeled data arising from non-random label acquisition, i.e., settings where the labeling probability varies across samples. The authors propose a prediction-augmented inverse probability weighting (IPW) inference framework. Methodologically, they integrate the Horvitz–Thompson and Hájek weighting principles from survey sampling into prediction-powered inference, jointly leveraging model predictions on unlabeled data and an estimated inclusion-probability model to calibrate the bias-correction term. The key contribution is an IPW-adjusted estimator shown to remain statistically valid under variable labeling probabilities while preserving nominal confidence-interval coverage. Simulations demonstrate that the IPW approach with estimated propensity scores substantially reduces variance, attaining performance close to the oracle setting with known labeling probabilities, while remaining robust and efficient.

📝 Abstract
Prediction-powered inference (PPI) is a recent framework for valid statistical inference with partially labeled data, combining model-based predictions on a large unlabeled set with bias correction from a smaller labeled subset. We show that PPI can be extended to handle informative labeling by replacing its unweighted bias-correction term with an inverse probability weighted (IPW) version, using the classical Horvitz–Thompson or Hájek forms. This connection unites design-based survey sampling ideas with modern prediction-assisted inference, yielding estimators that remain valid when labeling probabilities vary across units. We consider the common setting where the inclusion probabilities are not known but estimated from a correctly specified model. In simulations, the performance of IPW-adjusted PPI with estimated propensities closely matches the known-probability case, retaining both nominal coverage and the variance-reduction benefits of PPI.
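The abstract's estimator can be illustrated numerically. Below is a minimal sketch (not the paper's code) of estimating a population mean under informative labeling, comparing the unweighted PPI correction with Horvitz–Thompson and Hájek IPW corrections; the data-generating choices are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic population: the covariate drives both the outcome and the
# probability of being labeled (informative labeling). Illustrative setup,
# not the paper's simulation design.
N = 100_000
x = rng.normal(size=N)
y = 2.0 + 1.5 * x + rng.normal(size=N)        # true mean of y is 2.0
f = 2.0 + 1.2 * x                             # imperfect predictions f(X)

pi = 1.0 / (1.0 + np.exp(-(x - 2.0)))         # true inclusion probabilities
labeled = rng.random(N) < pi                  # observed labeling indicators

resid = y[labeled] - f[labeled]
w = 1.0 / pi[labeled]

# Classical PPI: unweighted correction, biased when labeling is informative
ppi_unweighted = f.mean() + resid.mean()

# IPW-adjusted corrections in the two classical survey-sampling forms
ppi_ht = f.mean() + (w @ resid) / N           # Horvitz–Thompson
ppi_hajek = f.mean() + (w @ resid) / w.sum()  # Hájek
```

With this setup the unweighted correction inherits the selection bias of the labeled subset, while both IPW forms recover the population mean; the Hájek form normalizes by the sum of realized weights, which typically stabilizes the estimate.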
Problem

Research questions and friction points this paper is trying to address.

Extends PPI to handle informative labeling via IPW
Unites survey sampling with prediction-assisted inference
Validates IPW-adjusted PPI with estimated propensities
Innovation

Methods, ideas, or system contributions that make the work stand out.

Inverse probability weighting for bias correction
Combines prediction-assisted inference with survey sampling
Estimates inclusion probabilities from a correctly specified model
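In practice the inclusion probabilities are unknown; the paper's setting fits them from a correctly specified model. A hedged sketch of that step, using a hand-rolled Newton solver for logistic regression (any GLM fitter would do; all numeric values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

# Covariate observed for every unit; only the labeling indicator is needed
# to fit the inclusion-probability model (illustrative values throughout).
N = 50_000
x = rng.normal(size=N)
true_pi = 1.0 / (1.0 + np.exp(-(-1.0 + 0.8 * x)))
labeled = (rng.random(N) < true_pi).astype(float)

# Correctly specified logistic model for P(labeled | x), fit by Newton's method
X = np.column_stack([np.ones(N), x])
beta = np.zeros(2)
for _ in range(25):
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    grad = X.T @ (labeled - p)                    # score vector
    hess = X.T @ (X * (p * (1.0 - p))[:, None])   # observed information
    beta += np.linalg.solve(hess, grad)

pi_hat = 1.0 / (1.0 + np.exp(-X @ beta))          # estimated propensities
```

The fitted `pi_hat` would then replace the true probabilities in the IPW bias-correction term; per the abstract, the resulting estimator performs close to the known-probability oracle.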