ESA: Annotation-Efficient Active Learning for Semantic Segmentation

📅 2024-08-24

🏛️ arXiv.org

📈 Citations: 12

✨ Influential: 0

🤖 AI Summary

To address the high annotation cost of semantic segmentation, and the neglect of image structural priors and pre-trained model capabilities in existing active learning methods, this paper proposes Entity-Superpixel Annotation (ESA), an active learning framework in which entities and superpixels work together. The method uses superpixels as the basic annotation unit and introduces a class-agnostic, entity-level annotation paradigm. It combines a class-agnostic mask proposal network with an entropy-driven superpixel selection mechanism to improve both structural awareness and annotation efficiency. Evaluated on standard benchmarks, the approach achieves a 1.71% mIoU improvement over strong baselines using only 40 user clicks, reducing click cost by 98% relative to pixel-based methods. It outperforms prior active learning approaches, easing the annotation bottleneck while making better use of pre-trained models.

📝 Abstract

Active learning enhances annotation efficiency by selecting the most informative samples for labeling, thereby reducing reliance on extensive human input. Previous methods in semantic segmentation have centered on individual pixels or small areas, neglecting the rich patterns in natural images and the power of advanced pre-trained models. To address these challenges, we make three key contributions. First, we introduce Entity-Superpixel Annotation (ESA), an efficient active learning strategy that couples a class-agnostic mask proposal network with superpixel grouping to capture local structural cues. Second, our method selects a subset of entities within each image of the target domain, prioritizing superpixels with high entropy to ensure comprehensive representation. Third, it focuses on a limited number of key entities, thereby optimizing for efficiency. By using an annotator-friendly design that capitalizes on the inherent structure of images, our approach significantly outperforms existing pixel-based methods, achieving superior results with minimal queries: it reduces click cost by 98% and improves performance by 1.71%. For instance, our technique requires a mere 40 clicks for annotation, in stark contrast to the 5000 clicks demanded by conventional methods.
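The abstract does not spell out how superpixels are grouped under the class-agnostic mask proposals. As a minimal sketch of one plausible rule, the snippet below assigns each superpixel to the entity that covers most of its pixels. The function name `assign_superpixels_to_entities`, the `-1` "no entity" convention, and the majority-vote rule itself are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def assign_superpixels_to_entities(superpixels, entity_map):
    """Majority-vote grouping (hypothetical rule, not taken from the paper).

    superpixels: (H, W) int array of superpixel ids in 0..S-1.
    entity_map:  (H, W) int array of entity ids in 0..E-1, with -1 meaning
                 "not covered by any mask proposal" (assumed convention).
    Returns a length-S array holding the entity id that covers the most
    pixels of each superpixel, or -1 if most pixels are uncovered.
    """
    sp = superpixels.ravel()
    ent = entity_map.ravel() + 1          # shift so that -1 becomes column 0
    n_sp = int(sp.max()) + 1
    n_ent = int(ent.max()) + 1
    votes = np.zeros((n_sp, n_ent), dtype=np.int64)
    np.add.at(votes, (sp, ent), 1)        # count pixels per (superpixel, entity)
    return votes.argmax(axis=1) - 1       # undo the shift

# Toy 2x4 image: two superpixels, two entity masks with a ragged boundary.
superpixels = np.array([[0, 0, 1, 1],
                        [0, 0, 1, 1]])
entity_map = np.array([[0, 0, 0, 1],
                       [0, 0, 1, 1]])
grouping = assign_superpixels_to_entities(superpixels, entity_map)
```

In this toy case the superpixel boundary, rather than the ragged mask edge, decides which pixels belong to each entity. Snapping mask proposals to local image structure in this way is one benefit such a grouping could offer.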

Problem

Research questions and friction points this paper is trying to address.

Improves annotation efficiency in semantic segmentation

Addresses neglect of natural patterns in prior methods

Reduces human input clicks by 98%

Innovation

Methods, ideas, or system contributions that make the work stand out.

Entity-Superpixel Annotation for efficient active learning

Class-agnostic mask proposal with superpixel grouping

High-entropy superpixel prioritization for comprehensive representation
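To make the high-entropy prioritization above concrete, here is a minimal sketch that ranks superpixels by the mean Shannon entropy of the model's per-pixel class probabilities. The function names, the use of a mean rather than another aggregate, and the top-`k` cutoff are assumptions made for illustration; the paper's exact scoring rule is not given in this summary.

```python
import numpy as np

def pixel_entropy(probs):
    """Shannon entropy per pixel. probs has shape (C, H, W) and sums to 1 over C."""
    eps = 1e-12  # avoids log(0) for hard predictions
    return -(probs * np.log(probs + eps)).sum(axis=0)

def rank_superpixels_by_entropy(probs, superpixels, k):
    """Return the ids of the k superpixels with the highest mean entropy.

    probs:       (C, H, W) softmax output of a segmentation model.
    superpixels: (H, W) int array of superpixel ids in 0..S-1.
    """
    ent = pixel_entropy(probs).ravel()
    labels = superpixels.ravel()
    n_sp = int(labels.max()) + 1
    sums = np.bincount(labels, weights=ent, minlength=n_sp)
    counts = np.bincount(labels, minlength=n_sp)
    mean_ent = sums / np.maximum(counts, 1)       # guard against unused ids
    order = np.argsort(-mean_ent, kind="stable")  # most uncertain first
    return order[:k].tolist(), mean_ent

# Toy example: 2 classes, 2x4 image, left half confident, right half uncertain.
probs = np.empty((2, 2, 4))
probs[0, :, :2], probs[1, :, :2] = 0.99, 0.01
probs[0, :, 2:], probs[1, :, 2:] = 0.5, 0.5
superpixels = np.array([[0, 0, 1, 1],
                        [0, 0, 1, 1]])
selected, mean_ent = rank_superpixels_by_entropy(probs, superpixels, k=1)
```

Here the uncertain right-hand superpixel is the one selected for annotation. In an entity-level scheme, these scores would presumably decide which entities an annotator is asked to click.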
