Likelihood-Ratio Regularized Quantile Regression: Adapting Conformal Prediction to High-Dimensional Covariate Shifts

📅 2025-02-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the failure of marginal coverage in conformal prediction under high-dimensional covariate shift, where labeled source data and unlabeled target data are available, this paper proposes LR-QR, a likelihood-ratio-regularized quantile regression method. Instead of explicitly estimating the high-dimensional likelihood ratio (LR), LR-QR learns a threshold function by minimizing the pinball loss together with an implicit LR-based regularizer that improves generalization. To the authors' knowledge, this is the first work to integrate algorithmic stability bounds into conformal coverage analysis, yielding rigorous guarantees of marginal coverage on the target domain up to a controllable error term. Experiments on the Communities and Crime regression benchmark and a WILDS image classification task show that LR-QR outperforms existing methods, attaining the desired target-domain coverage while keeping prediction intervals narrow. The core innovation lies in circumventing the intractable high-dimensional LR estimation problem and introducing a stability-driven paradigm for provable coverage guarantees.
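The threshold-learning step described above rests on pinball-loss quantile regression. The sketch below is a minimal, hypothetical illustration of that core ingredient only: it fits a linear threshold function to the (1 − α)-quantile of conformity scores by subgradient descent on the pinball loss. The data, the linear model, and the conformity scores are invented for illustration, and the paper's LR-based regularizer is deliberately omitted.

```python
import numpy as np

def pinball_loss(residual, alpha):
    # Pinball (quantile) loss at level 1 - alpha: asymmetric penalty
    # whose minimizer is the conditional (1 - alpha)-quantile.
    return np.maximum((1 - alpha) * residual, -alpha * residual)

rng = np.random.default_rng(0)
n, d, alpha = 500, 5, 0.1
X = rng.normal(size=(n, d))
# Stand-in conformity scores whose quantile varies linearly with X[:, 0]
scores = np.abs(rng.normal(size=n)) + 0.5 * X[:, 0]

# Fit a linear threshold q(x) = w @ x + b by subgradient descent
w, b, lr = np.zeros(d), 0.0, 0.05
for _ in range(2000):
    r = scores - (X @ w + b)
    # Subgradient of the pinball loss with respect to q(x)
    g = np.where(r > 0, -(1 - alpha), alpha)
    w -= lr * (X.T @ g) / n
    b -= lr * g.mean()

# Fraction of scores below the learned threshold; should be near 1 - alpha
coverage = np.mean(scores <= X @ w + b)
```

On source data this recovers coverage close to 1 − α; the paper's contribution is making such a threshold remain valid on the target domain without estimating the likelihood ratio.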

📝 Abstract
We consider the problem of conformal prediction under covariate shift. Given labeled data from a source domain and unlabeled data from a covariate shifted target domain, we seek to construct prediction sets with valid marginal coverage in the target domain. Most existing methods require estimating the unknown likelihood ratio function, which can be prohibitive for high-dimensional data such as images. To address this challenge, we introduce the likelihood ratio regularized quantile regression (LR-QR) algorithm, which combines the pinball loss with a novel choice of regularization in order to construct a threshold function without directly estimating the unknown likelihood ratio. We show that the LR-QR method has coverage at the desired level in the target domain, up to a small error term that we can control. Our proofs draw on a novel analysis of coverage via stability bounds from learning theory. Our experiments demonstrate that the LR-QR algorithm outperforms existing methods on high-dimensional prediction tasks, including a regression task for the Communities and Crime dataset, and an image classification task from the WILDS repository.
Problem

Research questions and friction points this paper is trying to address.

Addresses conformal prediction under covariate shift
Constructs valid prediction sets for target domains
Introduces LR-QR for high-dimensional data challenges
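The friction point listed above can be made concrete with a small simulation: standard split conformal prediction calibrates a single score quantile on source data, and its coverage degrades once the covariates shift. The data-generating process and predictor below are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
alpha = 0.1

def sample(n, x_mean):
    # Heteroscedastic data: noise scale grows with |x|, so shifting
    # the covariate distribution changes the score distribution.
    x = rng.normal(loc=x_mean, size=n)
    y = x + rng.normal(size=n) * (1 + np.abs(x))
    return x, y

f = lambda x: x  # assume the point predictor is given
xs, ys = sample(2000, x_mean=0.0)  # source (calibration) data
xt, yt = sample(2000, x_mean=2.0)  # covariate-shifted target data

# Split-conformal threshold: (1 - alpha)-quantile of source scores
q = np.quantile(np.abs(ys - f(xs)), 1 - alpha)

source_cov = np.mean(np.abs(ys - f(xs)) <= q)  # close to 1 - alpha
target_cov = np.mean(np.abs(yt - f(xt)) <= q)  # drops under shift
```

Weighted conformal methods repair this by reweighting with the likelihood ratio, which is what becomes intractable in high dimensions and what LR-QR is designed to avoid.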
Innovation

Methods, ideas, or system contributions that make the work stand out.

LR-QR algorithm for threshold-function learning
Pinball-loss quantile regression
Likelihood-ratio regularization without explicit LR estimation