Robust Classification under Noisy Labels: A Geometry-Aware Reliability Framework for Foundation Models

📅 2025-07-31

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address the loss of classification robustness when foundation models are adapted with noisily labeled data, this paper proposes a two-stage, geometry-aware reliability framework (reliability estimation followed by reliability-weighted inference) that requires no retraining. The framework builds a local neighborhood graph using the non-negative kernel (NNK) construction, which reduces the sensitivity of conventional k-NN to the distance metric and the neighborhood size. It then applies a noise-robust reliability estimator and performs adaptive, reliability-weighted inference over the NNK neighborhood, accommodating diverse noise patterns, including symmetric and asymmetric label noise. The method operates entirely on frozen, pre-trained embeddings and does not modify model parameters. Experiments on CIFAR-10 and DermaMNIST show that the approach surpasses standard k-NN and recent adaptive-neighborhood baselines, with improved robustness across the label noise settings tested.
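The two noise patterns named above can be illustrated with a small label-corruption sketch. This is only an illustration under stated assumptions, not the paper's experimental protocol: the function names, the `rate` parameter, and the class `mapping` are hypothetical, and symmetric noise is assumed here to always change the label.

```python
import numpy as np

def symmetric_noise(labels, rate, num_classes, rng):
    """Flip roughly a fraction `rate` of labels to a different, uniformly chosen class.

    Assumption: a flipped label never stays the same (some definitions allow that).
    """
    noisy = labels.copy()
    flip = rng.random(labels.shape[0]) < rate
    # An offset in [1, num_classes - 1] modulo num_classes guarantees a changed label.
    offsets = rng.integers(1, num_classes, size=int(flip.sum()))
    noisy[flip] = (labels[flip] + offsets) % num_classes
    return noisy

def asymmetric_noise(labels, rate, mapping, rng):
    """Flip roughly a fraction `rate` of each source class to one fixed target class.

    `mapping` is a hypothetical dict such as {source_class: target_class}.
    """
    noisy = labels.copy()
    flip = rng.random(labels.shape[0]) < rate
    for src, dst in mapping.items():
        noisy[flip & (labels == src)] = dst
    return noisy
```

For example, `symmetric_noise(y, 0.4, 10, np.random.default_rng(0))` corrupts about 40% of a 10-class label vector `y`, while `asymmetric_noise(y, 0.4, {3: 5}, rng)` relabels about 40% of class 3 as class 5 and leaves other classes untouched.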

📝 Abstract

Foundation models (FMs) pretrained on large datasets have become fundamental for various downstream machine learning tasks, in particular in scenarios where obtaining perfectly labeled data is prohibitively expensive. In this paper, we assume an FM has to be fine-tuned with noisy data and present a two-stage framework to ensure robust classification in the presence of label noise without model retraining. Recent work has shown that simple k-nearest neighbor (kNN) approaches using an embedding derived from an FM can achieve good performance even in the presence of severe label noise. Our work is motivated by the fact that these methods make use of local geometry. Following a similar two-stage procedure (reliability estimation followed by reliability-weighted inference), we show that improved performance can be achieved by introducing geometry information. For a given instance, our proposed inference uses a local neighborhood of training data, obtained using the non-negative kernel (NNK) neighborhood construction. We propose several methods for reliability estimation that can rely less on distance and local neighborhood as the label noise increases. Our evaluation on CIFAR-10 and DermaMNIST shows that our methods improve robustness across various noise conditions, surpassing standard kNN approaches and recent adaptive-neighborhood baselines.
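The two-stage procedure mentioned in the abstract (reliability estimation, then reliability-weighted inference) can be sketched on frozen embeddings with a plain kNN baseline. This is a minimal sketch under assumptions, not the paper's estimators: reliability is assumed here to be the fraction of a training point's neighbors that share its (possibly noisy) label, neighbors are found by cosine similarity, and all function names are hypothetical.

```python
import numpy as np

def knn_indices(queries, bank, k, exclude_self=False):
    """Indices of the k most cosine-similar bank rows for each query row."""
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    b = bank / np.linalg.norm(bank, axis=1, keepdims=True)
    sim = q @ b.T
    if exclude_self:
        # Only valid when queries and bank are the same array.
        np.fill_diagonal(sim, -np.inf)
    return np.argsort(-sim, axis=1)[:, :k]

def estimate_reliability(emb, noisy_labels, k=10):
    """Stage 1 (assumed estimator): neighbor label agreement in [0, 1]."""
    nbrs = knn_indices(emb, emb, k, exclude_self=True)
    agree = noisy_labels[nbrs] == noisy_labels[:, None]
    return agree.mean(axis=1)

def predict(queries, emb, noisy_labels, reliability, num_classes, k=10):
    """Stage 2: each neighbor votes for its label with weight = its reliability."""
    nbrs = knn_indices(queries, emb, k)
    votes = np.zeros((queries.shape[0], num_classes))
    for c in range(num_classes):
        votes[:, c] = (reliability[nbrs] * (noisy_labels[nbrs] == c)).sum(axis=1)
    return votes.argmax(axis=1)
```

No model parameters are touched: both stages only read the embedding matrix and the noisy labels, which is what allows the approach to skip retraining. The paper replaces the fixed-k neighborhood used here with an NNK neighborhood and proposes estimators that depend less on distance as noise grows.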

Problem

Research questions and friction points this paper is trying to address.

Robust classification under noisy label conditions

Geometry-aware reliability framework for foundation models

Improving performance without model retraining

Innovation

Methods, ideas, or system contributions that make the work stand out.

Geometry-aware reliability framework for noisy labels

Non-negative kernel neighborhood construction method

Reliability estimation that relies less on distance and local neighborhood as label noise increases
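As a rough illustration of the NNK neighborhood construction listed above: NNK is commonly formulated as a non-negative quadratic program over an initial candidate set, minimizing ½ θᵀKθ − kᵀθ subject to θ ≥ 0, where K is the kernel matrix among candidates and k the kernel vector between candidates and the query. Candidates whose weight ends up zero are dropped from the neighborhood. The sketch below assumes a Gaussian kernel and a simple projected-gradient solver; the kernel, the solver, and the names `gaussian_kernel` and `nnk_weights` are assumptions, not details taken from this page.

```python
import numpy as np

def gaussian_kernel(a, b, sigma=1.0):
    """Pairwise Gaussian kernel between rows of a and rows of b."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def nnk_weights(query, candidates, sigma=1.0, iters=500, tol=1e-8):
    """Non-negative weights for one query over its candidate neighbors.

    Solves min_{theta >= 0} 0.5 * theta^T K theta - kq^T theta
    by projected gradient descent (an assumed, simple solver).
    """
    K = gaussian_kernel(candidates, candidates, sigma)
    kq = gaussian_kernel(candidates, query[None, :], sigma)[:, 0]
    # Step size 1 / largest eigenvalue of K keeps the iteration stable.
    step = 1.0 / np.linalg.eigvalsh(K)[-1]
    theta = np.zeros(len(candidates))
    for _ in range(iters):
        theta = np.maximum(0.0, theta - step * (K @ theta - kq))
    theta[theta < tol] = 0.0
    return theta
```

With a query at the origin and candidates at (1, 0), (2, 0), and (0, 1), the candidate at (2, 0) receives zero weight because (1, 0) already covers that direction, which is the geometric pruning that separates NNK from a fixed-size kNN neighborhood.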
