🤖 AI Summary
This paper addresses open-set label shift (OSLS), in which the label distribution changes between the source and target domains and the target domain contains an additional out-of-distribution (OOD) class. The goal is to estimate the label distributions accurately and to correct the classifier for the target domain without retraining. The proposed method is a three-stage estimation procedure that relies only on an in-distribution (ID) classifier trained on the source domain and an ID/OOD binary classifier. It first estimates the source label distribution of the OOD class, then runs an EM algorithm to obtain the maximum likelihood estimate of the target label distribution, and finally estimates the target label distribution of the OOD class under relaxed assumptions on the OOD classifier. The sampling errors of the first and third stages are bounded with a concentration inequality. The resulting estimates are used to adapt the source-trained ID classifier to the target distribution without retraining, and experiments across a variety of OSLS settings show the approach to be effective. Code is publicly available.
📝 Abstract
Open-set label shift (OSLS) occurs when label distributions change from a source to a target distribution, and the target distribution has an additional out-of-distribution (OOD) class. In this work, we build estimators for both source and target open-set label distributions using a source-domain in-distribution (ID) classifier and an ID/OOD classifier. With reasonable assumptions on the ID/OOD classifier, the estimators are assembled into a sequence of three stages: 1) an estimate of the source label distribution of the OOD class, 2) an EM algorithm for maximum likelihood estimates (MLE) of the target label distribution, and 3) an estimate of the target label distribution of the OOD class under relaxed assumptions on the OOD classifier. The sampling errors of the estimates in 1) and 3) are quantified with a concentration inequality. The estimation result allows us to correct the ID classifier trained on the source distribution to the target distribution without retraining. Experiments on a variety of open-set label shift settings demonstrate the effectiveness of our model. Our code is available at https://github.com/ChangkunYe/OpenSetLabelShift.
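To make the EM stage and the retraining-free correction more concrete, here is a minimal sketch of the classical closed-set EM estimator for label shift. It is not the paper's open-set procedure: it has no OOD class and does not use an ID/OOD classifier. The function names `em_label_shift` and `correct_posterior`, and the parameters `n_iter` and `tol`, are hypothetical and chosen only for illustration. The sketch assumes a known, strictly positive source prior and classifier posteriors evaluated on unlabeled target samples.

```python
def em_label_shift(probs, source_prior, n_iter=100, tol=1e-8):
    """Closed-set EM estimate of the target label distribution (illustrative sketch).

    probs: list of per-sample posteriors p_s(y | x_i) from the source classifier.
    source_prior: source label distribution p_s(y); every entry must be > 0.
    """
    k = len(source_prior)
    target_prior = list(source_prior)  # start from the source prior
    for _ in range(n_iter):
        totals = [0.0] * k
        for p in probs:
            # E-step: reweight the source posterior by the current prior ratio.
            w = [p[j] * target_prior[j] / source_prior[j] for j in range(k)]
            z = sum(w)
            for j in range(k):
                totals[j] += w[j] / z
        # M-step: the new prior is the average reweighted posterior.
        new_prior = [t / len(probs) for t in totals]
        delta = max(abs(a - b) for a, b in zip(new_prior, target_prior))
        target_prior = new_prior
        if delta < tol:
            break
    return target_prior


def correct_posterior(p, source_prior, target_prior):
    """Adapt one source posterior to the target prior without retraining."""
    w = [pj * t / s for pj, s, t in zip(p, source_prior, target_prior)]
    z = sum(w)
    return [wj / z for wj in w]


if __name__ == "__main__":
    # Toy target set: three samples lean toward class 0, one toward class 1.
    probs = [[0.9, 0.1], [0.9, 0.1], [0.9, 0.1], [0.1, 0.9]]
    q = em_label_shift(probs, [0.5, 0.5])
    print(q, correct_posterior([0.5, 0.5], [0.5, 0.5], q))
```

In this sketch the classifier itself is never updated. Only its output probabilities are rescaled by the ratio of estimated target prior to source prior and then renormalized, which is the sense in which the correction needs no retraining.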