Estimating the Conformal Prediction Threshold from Noisy Labels

📅 2025-01-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses the challenge that conformal prediction fails to reliably calibrate its threshold when validation labels are noisy. We propose a noise-aware unsupervised calibration method that requires neither clean labels nor access to model internals, and—crucially—provides the first theoretical coverage guarantee for conformal prediction under validation sets with erroneous labels. The method accommodates diverse noise modeling assumptions (e.g., symmetric or instance-dependent noise) without imposing restrictive distributional assumptions on the data. By integrating noise modeling, statistical inference, and robust threshold estimation, our approach significantly outperforms existing noise-robust conformal methods across multi-class benchmarks—including ImageNet—and multiple natural and medical imaging datasets. It achieves coverage performance nearly matching that of the oracle baseline using clean validation labels, demonstrating strong robustness and practical utility.

📝 Abstract
Conformal Prediction (CP) is a method to control prediction uncertainty by producing a small prediction set, ensuring a predetermined probability that the true class lies within this set. This is commonly done by defining a score, based on the model predictions, and setting a threshold on this score using a validation set. In this study, we address the problem of CP calibration when we only have access to a validation set with noisy labels. We show how we can estimate the noise-free conformal threshold based on the noisy labeled data. Our solution is flexible and can accommodate various modeling assumptions regarding the label contamination process, without needing any information about the underlying data distribution or the internal mechanisms of the machine learning classifier. We develop a coverage guarantee for uniform noise that is effective even in tasks with a large number of classes. We dub our approach Noise-Aware Conformal Prediction (NACP) and show on several natural and medical image classification datasets, including ImageNet, that it significantly outperforms current noisy label methods and achieves results comparable to those obtained with a clean validation set.
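The calibration step the abstract describes — defining a nonconformity score from the model's predictions and thresholding it at an empirical quantile of the validation set — can be sketched as follows. This is a minimal split-conformal sketch, not the paper's code; the names `conformal_threshold` and `prediction_set` and the choice of score (1 minus the predicted probability of the labeled class) are illustrative assumptions.

```python
import numpy as np

def conformal_threshold(scores, alpha):
    """Split-conformal threshold: the ceil((n+1)(1-alpha))/n empirical
    quantile of the validation nonconformity scores.  With clean labels
    this guarantees >= 1 - alpha marginal coverage."""
    n = len(scores)
    q = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    return np.quantile(scores, q, method="higher")

def prediction_set(probs, tau):
    """All classes whose nonconformity score (1 - probability) is <= tau."""
    return [k for k, p in enumerate(probs) if 1.0 - p <= tau]
```

With a noisy validation set, the scores fed to `conformal_threshold` are computed against the wrong labels, so the resulting threshold (and hence coverage) is biased — the failure mode this paper targets.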
Problem

Research questions and friction points this paper is trying to address.

Noisy-label validation set
Correcting conformal calibration for label noise
Estimating the noise-free conformal threshold
Innovation

Methods, ideas, or system contributions that make the work stand out.

Noise-Aware Conformal Prediction (NACP)
Flexible label-noise models (e.g., instance-dependent noise)
Coverage guarantee under uniform noise