Data-Efficient Prediction-Powered Calibration via Cross-Validation

📅 2025-07-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
When calibration data are scarce, AI models struggle to reliably quantify predictive uncertainty. To address this, we propose a prediction-powered calibration framework that jointly fine-tunes a synthetic label generator and estimates its systematic bias—without requiring additional ground-truth labels. Leveraging cross-validation, the method constructs bias-aware prediction intervals with rigorous statistical coverage guarantees. Crucially, label synthesis, bias estimation, and predictive calibration are unified within a single optimization objective, achieving superior trade-offs between data efficiency and calibration accuracy. Experiments on indoor localization demonstrate substantial improvements in calibration performance: under extreme sparsity (only 5–10 calibration samples), our approach maintains both high coverage probability and narrow interval width, outperforming existing methods in reliability and precision.

📝 Abstract
Calibration data are necessary to formally quantify the uncertainty of the decisions produced by an existing artificial intelligence (AI) model. To overcome the common issue of scarce calibration data, a promising approach is to employ synthetic labels produced by a (generally different) predictive model. However, fine-tuning the label-generating predictor on the inference task of interest, as well as estimating the residual bias of the synthetic labels, demand additional data, potentially exacerbating the calibration data scarcity problem. This paper introduces a novel approach that efficiently utilizes limited calibration data to simultaneously fine-tune a predictor and estimate the bias of the synthetic labels. The proposed method yields prediction sets with rigorous coverage guarantees for AI-generated decisions. Experimental results on an indoor localization problem validate the effectiveness and performance gains of our solution.
Problem

Research questions and friction points this paper is trying to address.

Efficiently calibrate AI models with scarce data
Fine-tune predictor and estimate synthetic label bias
Ensure rigorous coverage for AI-generated decision sets
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses cross-validation for efficient data utilization
Simultaneously fine-tunes predictor and estimates bias
Ensures rigorous coverage for AI-generated decisions
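The paper's exact algorithm is not reproduced on this page, but the general idea behind cross-validated, prediction-powered calibration can be sketched as follows. This is a hedged illustration, not the authors' method: the ridge predictor, the 1-D targets, and all function names (`fit_ridge`, `cv_calibrate`) are illustrative assumptions. Each of the K folds is held out once, the label-generating predictor is fine-tuned on the remaining folds, and the held-out residuals serve as bias-aware conformity scores, so every scarce labeled sample contributes to both fine-tuning and bias estimation.

```python
# Hedged sketch of cross-validated, bias-corrected conformal calibration.
# NOT the paper's algorithm; the setup (ridge predictor, toy data) is assumed.
import numpy as np

rng = np.random.default_rng(0)

def fit_ridge(X, y, lam=1.0):
    """Fine-tune a simple synthetic-label generator (ridge regression)."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def cv_calibrate(X, y, K=5, alpha=0.1):
    """Each sample is used both to fine-tune the predictor (when in the
    training folds) and to score its residual bias (when held out)."""
    n = len(y)
    folds = np.array_split(rng.permutation(n), K)
    scores = np.empty(n)
    for fold in folds:
        train = np.setdiff1d(np.arange(n), fold)
        w = fit_ridge(X[train], y[train])             # fine-tune on K-1 folds
        scores[fold] = np.abs(y[fold] - X[fold] @ w)  # held-out residuals
    # Conformal quantile with finite-sample correction -> coverage >= 1-alpha
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    q = np.quantile(scores, level, method="higher")
    w_full = fit_ridge(X, y)                          # final predictor
    return w_full, q

# Toy data: 10 calibration samples (the extreme-sparsity regime)
X = rng.normal(size=(10, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=10)
w, q = cv_calibrate(X, y)
x_new = rng.normal(size=3)
interval = (x_new @ w - q, x_new @ w + q)  # bias-aware prediction interval
```

With only 10 samples, a held-out calibration split would leave almost nothing for fine-tuning; the cross-validation reuse above is what makes the extreme-sparsity regime workable, which is the trade-off the paper targets.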
Seonghoon Yoo
Department of Electrical Engineering, Korea Advanced Institute of Science and Technology, Daejeon 34141, South Korea
Houssem Sifaou
Research Associate, Electrical Engineering
Wireless communications, signal processing, machine learning
Sangwoo Park
King’s Communications, Learning & Information Processing (KCLIP) Lab, Centre for Intelligent Information Processing Systems (CIIPS), Department of Engineering, King’s College London, WC2R 2LS London, U.K.
Joonhyuk Kang
Professor of Electrical Engineering, KAIST
Signal Processing and Machine Learning for Wireless Communication Systems
Osvaldo Simeone
King's College London
Information theory, machine learning, quantum information processing, wireless systems