Active Learning with a Noisy Annotator

📅 2025-04-06
📈 Citations: 0
Influential: 0
🤖 AI Summary
Under low-labeling-budget settings, noisy labels severely degrade active learning performance and cause sample coverage imbalance. To address this, we propose the Noise-Aware Active Sampling (NAS) framework. NAS integrates coverage-based active learning with a noise-driven resampling mechanism, incorporating a lightweight noise filtering module built on NAS's own selection mechanism and a procedure for identifying noise-affected regions to enhance sampling robustness. Built upon a greedy coverage strategy, NAS prioritizes high-informativeness, low-noise samples. Experiments on CIFAR-100 and an ImageNet subset demonstrate that NAS consistently improves the performance of multiple state-of-the-art active learning methods. Notably, NAS exhibits strong robustness against both symmetric and asymmetric label noise across varying noise rates. Its design mitigates the adverse impact of label corruption while preserving coverage diversity, enabling reliable model training under realistic, resource-constrained, and noisy labeling scenarios.

📝 Abstract
Active Learning (AL) aims to reduce annotation costs by strategically selecting the most informative samples for labeling. However, most active learning methods struggle in the low-budget regime where only a few labeled examples are available. This issue becomes even more pronounced when annotators provide noisy labels. A common AL approach for the low- and mid-budget regimes focuses on maximizing the coverage of the labeled set across the entire dataset. We propose a novel framework called Noise-Aware Active Sampling (NAS) that extends existing greedy, coverage-based active learning strategies to handle noisy annotations. NAS identifies regions that remain uncovered due to the selection of noisy representatives and enables resampling from these areas. We introduce a simple yet effective noise filtering approach suitable for the low-budget regime, which leverages the inner mechanism of NAS and can be applied for noise filtering before model training. On multiple computer vision benchmarks, including CIFAR100 and ImageNet subsets, NAS significantly improves performance for standard active learning methods across different noise types and rates.
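The abstract describes a greedy, coverage-based selection loop in which regions claimed by a noisy representative are treated as still uncovered, so they can be resampled. A minimal sketch of that idea (not the authors' implementation; the delta-ball coverage graph and the `is_noisy` hook are illustrative assumptions in the style of coverage-based AL methods such as ProbCover):

```python
import numpy as np

def greedy_coverage_select(features, budget, radius, is_noisy=None):
    """Greedy max-coverage sample selection over a delta-ball graph.

    features : (n, d) array of sample embeddings
    budget   : number of samples to select for labeling
    radius   : ball radius defining which points a sample covers
    is_noisy : optional callable(idx) -> bool, a hypothetical hook standing
               in for a noise filter's verdict on a selected sample
    """
    n = len(features)
    # Pairwise squared distances -> coverage graph: cover[i, j] is True
    # when sample i covers sample j (j lies inside i's delta-ball).
    d2 = ((features[:, None, :] - features[None, :, :]) ** 2).sum(-1)
    cover = d2 <= radius ** 2
    covered = np.zeros(n, dtype=bool)
    selected = []
    while len(selected) < budget:
        # Pick the sample that covers the most currently uncovered points.
        gains = (cover & ~covered).sum(axis=1)
        gains[selected] = -1  # never re-pick an already selected sample
        i = int(gains.argmax())
        selected.append(i)
        if is_noisy is not None and is_noisy(i):
            # Noise-aware step: a noisy representative does not cover its
            # region, so its ball stays uncovered and can be resampled later.
            continue
        covered |= cover[i]
    return selected
```

With a clean annotator the loop is plain greedy coverage; flagging a pick as noisy leaves its neighborhood uncovered, so later iterations naturally resample from that region, which is the core resampling behavior the abstract describes.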
Problem

Research questions and friction points this paper is trying to address.

Active learning struggles when labeled data is both scarce and noisy
Noisy annotations reduce the coverage of the labeled set in low-budget AL regimes
How can coverage-based sampling be made robust to annotation noise?
Innovation

Methods, ideas, or system contributions that make the work stand out.

Noise-Aware Active Sampling (NAS) extends greedy coverage-based AL to noisy annotations
Resamples regions left uncovered by noisy representatives
Simple, effective noise filtering suited to low-budget regimes, applied before model training