Open Set Label Shift with Test Time Out-of-Distribution Reference

📅 2025-05-09

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper addresses open-set label shift (OSLS), in which the label distribution changes between the source and target domains and the target domain contains an additional out-of-distribution (OOD) class. The goal is to estimate the label distributions and correct the classifier without retraining. The proposed method is a three-stage estimation procedure that relies only on a source-domain in-distribution (ID) classifier and an ID/OOD classifier: an estimate of the source label distribution of the OOD class, an EM algorithm for the maximum likelihood estimate of the target label distribution, and an estimate of the target label distribution of the OOD class under relaxed assumptions on the OOD classifier. The sampling errors of the first and third estimates are bounded with a concentration inequality. The resulting estimates are used to correct the source-trained ID classifier for the target distribution, and experiments on a variety of OSLS settings support the approach. Code is publicly available.

📝 Abstract

Open set label shift (OSLS) occurs when label distributions change from a source to a target distribution, and the target distribution has an additional out-of-distribution (OOD) class. In this work, we build estimators for both source and target open set label distributions using a source domain in-distribution (ID) classifier and an ID/OOD classifier. With reasonable assumptions on the ID/OOD classifier, the estimators are assembled into a sequence of three stages: 1) an estimate of the source label distribution of the OOD class, 2) an EM algorithm for maximum likelihood estimates (MLE) of the target label distribution, and 3) an estimate of the target label distribution of the OOD class under relaxed assumptions on the OOD classifier. The sampling errors of the estimates in 1) and 3) are quantified with a concentration inequality. The estimation result allows us to correct the ID classifier trained on the source distribution to the target distribution without retraining. Experiments on a variety of open set label shift settings demonstrate the effectiveness of our model. Our code is available at https://github.com/ChangkunYe/OpenSetLabelShift.
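
To make stage 2 and the "correct without retraining" step more concrete, here is a minimal sketch of the classical closed-set EM re-estimation of class priors from classifier outputs, followed by the prior-ratio correction of those outputs. It is not the paper's open-set estimator: it ignores the OOD class and the ID/OOD classifier entirely, and the function names `em_label_shift` and `correct_posteriors` are hypothetical, chosen only for illustration. It assumes the source classifier outputs calibrated posteriors and that every source prior is strictly positive.

```python
import numpy as np


def em_label_shift(probs, source_prior, n_iter=100, tol=1e-8):
    """Estimate the target label distribution with closed-set EM.

    probs: (n, K) array of source-classifier posteriors on unlabeled
           target samples (each row sums to 1).
    source_prior: length-K source label distribution (all entries > 0).
    """
    probs = np.asarray(probs, dtype=float)
    source_prior = np.asarray(source_prior, dtype=float)
    target_prior = source_prior.copy()  # initialize at the source prior

    for _ in range(n_iter):
        # E-step: reweight each posterior by the current prior ratio
        # and renormalize each row.
        weighted = probs * (target_prior / source_prior)
        post = weighted / weighted.sum(axis=1, keepdims=True)

        # M-step: the new prior is the average corrected posterior.
        new_prior = post.mean(axis=0)

        converged = np.abs(new_prior - target_prior).max() < tol
        target_prior = new_prior
        if converged:
            break

    return target_prior


def correct_posteriors(probs, source_prior, target_prior):
    """Adjust source posteriors to the target prior without retraining."""
    ratio = np.asarray(target_prior, dtype=float) / np.asarray(source_prior, dtype=float)
    weighted = np.asarray(probs, dtype=float) * ratio
    return weighted / weighted.sum(axis=1, keepdims=True)
```

In this sketch the classifier itself is never updated; only its output probabilities are rescaled by the ratio of estimated target prior to source prior. An open-set variant would additionally need an estimate for the OOD class, which is what stages 1 and 3 of the abstract provide.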

Problem

Research questions and friction points this paper is trying to address.

Estimating source and target open set label distributions

Correcting ID classifier without retraining for target distribution

Quantifying sampling errors in OOD class estimation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Estimators for source and target label distributions

Three-stage estimation procedure with an EM algorithm for the MLE of the target label distribution

Corrects ID classifier without retraining

🔎 Similar Papers

OpenSlot: Mixed Open-set Recognition with Object-centric Learning