PPI-SVRG: Unifying Prediction-Powered Inference and Variance Reduction for Semi-Supervised Optimization

📅 2026-01-29
🤖 AI Summary
This work proposes the PPI-SVRG framework for semi-supervised learning settings where labels are scarce but predictions from a pretrained model are available. By unifying prediction-powered inference (PPI) with stochastic variance-reduced gradient (SVRG), the method uses pretrained predictions as control variates in the reference gradient, reducing the variance of stochastic optimization. The theoretical analysis shows that PPI and SVRG are mathematically equivalent, and that the convergence rate of the proposed algorithm depends only on the geometry of the loss function, while prediction quality affects only the size of the convergence neighborhood. Empirically, the method reduces mean squared error by 43–52% on mean estimation tasks and improves test accuracy by 2.7–2.9 percentage points on MNIST with only 10% labeled data.
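The control-variate parallel between the two methods can be seen by writing their textbook forms side by side. The notation below (predictor f, labeled sample size n, unlabeled pool size N, snapshot point w̃, per-example loss ℓ_i) is assumed for illustration and is not taken from the paper, whose precise statement of the equivalence may differ.

```latex
% PPI mean estimator: n labeled pairs (X_i, Y_i), N unlabeled inputs \tilde{X}_j,
% pretrained predictor f (notation assumed for illustration)
\hat{\theta}_{\mathrm{PPI}}
  = \underbrace{\frac{1}{N}\sum_{j=1}^{N} f(\tilde{X}_j)}_{\text{proxy averaged over the large pool}}
  + \underbrace{\frac{1}{n}\sum_{i=1}^{n}\bigl(Y_i - f(X_i)\bigr)}_{\text{correction on the small labeled sample}}

% SVRG gradient estimate at iterate w_t, with snapshot \tilde{w} and sampled index i_t
g_t
  = \underbrace{\frac{1}{n}\sum_{i=1}^{n}\nabla \ell_i(\tilde{w})}_{\text{proxy averaged over all data}}
  + \underbrace{\nabla \ell_{i_t}(w_t) - \nabla \ell_{i_t}(\tilde{w})}_{\text{correction on the sampled point}}
```

In both expressions a cheap, correlated proxy is averaged over a large set, and a small-sample term corrects its bias, so the estimator stays unbiased while its variance shrinks as the proxy tracks the target more closely.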

📝 Abstract
We study semi-supervised stochastic optimization when labeled data is scarce but predictions from pre-trained models are available. PPI and SVRG both reduce variance through control variates: PPI uses predictions, while SVRG uses reference gradients. We show they are mathematically equivalent and develop PPI-SVRG, which combines both. Our convergence bound decomposes into the standard SVRG rate plus an error floor from prediction uncertainty. The rate depends only on loss geometry; predictions affect only the neighborhood size. When predictions are perfect, we recover SVRG exactly. When predictions degrade, convergence remains stable but reaches a larger neighborhood. Experiments confirm the theory: PPI-SVRG reduces MSE by 43–52% under label scarcity on mean estimation benchmarks and improves test accuracy by 2.7–2.9 percentage points on MNIST with only 10% labeled data.
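To make the control-variate idea concrete, here is a minimal sketch of the standard PPI mean estimator on synthetic data, compared with the labeled-only average. It is not the paper's PPI-SVRG algorithm or its benchmark: the function names, the Gaussian data model, the noisy "predictor", and all parameter values are assumptions chosen for illustration.

```python
import random
from statistics import fmean


def classical_mean(y_labeled):
    """Baseline: average of the scarce labels only."""
    return fmean(y_labeled)


def ppi_mean(y_labeled, pred_labeled, pred_unlabeled):
    """PPI-style mean estimate: predictions averaged over the large unlabeled
    pool, plus a bias correction measured on the labeled sample."""
    rectifier = fmean([y - f for y, f in zip(y_labeled, pred_labeled)])
    return fmean(pred_unlabeled) + rectifier


def simulate_mse(n_labeled=50, n_unlabeled=2000, pred_noise=0.5,
                 trials=200, seed=0):
    """Monte Carlo MSE of both estimators under an assumed toy model:
    Y ~ N(2, 1), and the hypothetical predictor returns Y plus Gaussian noise."""
    rng = random.Random(seed)
    true_mean = 2.0
    err_classical, err_ppi = [], []
    for _ in range(trials):
        y_lab = [rng.gauss(true_mean, 1.0) for _ in range(n_labeled)]
        # Labels of the unlabeled pool exist only to generate predictions;
        # neither estimator ever sees them.
        y_unl = [rng.gauss(true_mean, 1.0) for _ in range(n_unlabeled)]
        f_lab = [y + rng.gauss(0.0, pred_noise) for y in y_lab]
        f_unl = [y + rng.gauss(0.0, pred_noise) for y in y_unl]
        err_classical.append((classical_mean(y_lab) - true_mean) ** 2)
        err_ppi.append((ppi_mean(y_lab, f_lab, f_unl) - true_mean) ** 2)
    return fmean(err_classical), fmean(err_ppi)


mse_classical, mse_ppi = simulate_mse()
print(f"labeled-only MSE: {mse_classical:.5f}   PPI MSE: {mse_ppi:.5f}")
```

Raising `pred_noise` in this toy setup weakens the control variate, so the PPI estimate stays unbiased but its error grows. That loosely mirrors the abstract's claim that prediction quality affects the size of the neighborhood reached rather than whether the method converges.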
Problem

Research questions and friction points this paper is trying to address.

semi-supervised optimization
prediction-powered inference
variance reduction
stochastic optimization
label scarcity
Innovation

Methods, ideas, or system contributions that make the work stand out.

PPI-SVRG
variance reduction
semi-supervised optimization
control variates
prediction-powered inference
Ruicheng Ao
Institute for Data, Systems, and Society, Massachusetts Institute of Technology
Hongyu Chen
Institute for Data, Systems, and Society, Massachusetts Institute of Technology
Haoyang Liu
Department of Mathematics, Washington University in Saint Louis
David Simchi-Levi
Professor of Engineering Systems at MIT
Operations Research
Will Wei Sun
Associate Professor, Daniels School of Business, Purdue University
Machine Learning, Statistics