🤖 AI Summary
Under low-labeling-budget settings, noisy labels severely degrade active learning performance and cause sample coverage imbalance. To address this, we propose the Noise-Aware Active Sampling (NAS) framework. NAS is the first to integrate coverage-based active learning with a noise-driven resampling mechanism, incorporating a lightweight endogenous noise filtering module and a noise-sensitive region identification mechanism to enhance sampling robustness. Built upon a greedy coverage strategy, NAS prioritizes high-informativeness, low-noise samples. Experiments on CIFAR-100 and an ImageNet subset demonstrate that NAS consistently improves the performance of multiple state-of-the-art active learning methods. Notably, NAS exhibits strong robustness against both symmetric and asymmetric label noise across varying noise rates. Its design effectively mitigates the adverse impact of label corruption while preserving coverage diversity, thereby enabling reliable model training under realistic, resource-constrained, and noisy labeling scenarios.
📝 Abstract
Active Learning (AL) aims to reduce annotation costs by strategically selecting the most informative samples for labeling. However, most active learning methods struggle in the low-budget regime, where only a few labeled examples are available. This issue becomes even more pronounced when annotators provide noisy labels. A common AL approach for the low- and mid-budget regimes focuses on maximizing the coverage of the labeled set across the entire dataset. We propose a novel framework called Noise-Aware Active Sampling (NAS) that extends existing greedy, coverage-based active learning strategies to handle noisy annotations. NAS identifies regions that remain uncovered due to the selection of noisy representatives and enables resampling from these areas. We also introduce a simple yet effective noise filtering approach suited to the low-budget regime; it leverages the inner mechanism of NAS and can be applied before model training. On multiple computer vision benchmarks, including CIFAR-100 and ImageNet subsets, NAS significantly improves performance for standard active learning methods across different noise types and rates.
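The core idea described above, greedy coverage-based selection combined with down-weighting of likely-noisy candidates and re-opening the neighborhoods of noisy picks, can be illustrated with a minimal sketch. This is not the authors' exact NAS algorithm; the function name, the `noise_scores` input, and the fixed coverage radius are illustrative assumptions.

```python
import numpy as np

def greedy_coverage_select(features, budget, radius,
                           noise_scores=None, noise_threshold=0.5):
    """Illustrative sketch of greedy coverage-based active sampling with a
    noise-aware filter (NOT the paper's exact NAS procedure).

    features: (n, d) array of sample embeddings.
    budget: number of samples to select for labeling.
    radius: a selected point "covers" neighbors within this distance.
    noise_scores: optional per-sample noise estimates in [0, 1]; a
        hypothetical stand-in for an endogenous noise-filtering module.
    """
    n = features.shape[0]
    covered = np.zeros(n, dtype=bool)
    selected = []
    # Pairwise distances; fine for small n, use a spatial index at scale.
    dists = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=-1)
    for _ in range(budget):
        # Greedy step: pick the candidate that newly covers the most
        # currently uncovered points, skipping likely-noisy candidates.
        gains = (~covered[None, :] & (dists <= radius)).sum(axis=1)
        if noise_scores is not None:
            gains = np.where(noise_scores > noise_threshold, -1, gains)
        best = int(np.argmax(gains))
        if gains[best] <= 0:
            break  # nothing useful (or sufficiently clean) left to cover
        selected.append(best)
        covered |= dists[best] <= radius
        # Resampling idea from the abstract: if a selected point is later
        # found to be noisy, mark its neighborhood uncovered again so a
        # clean representative can be chosen in a subsequent round.
    return selected, covered
```

With two well-separated clusters and a budget of two, the greedy step picks one representative per cluster; flagging a candidate as noisy steers selection to a clean neighbor in the same region.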