Decoupling Continual Semantic Segmentation

📅 2025-08-07
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Continual Semantic Segmentation (CSS) suffers from an inherent trade-off between catastrophic forgetting and learning novel classes. To address this, we propose the first decoupled continual learning framework that separately models class-aware object detection and class-agnostic pixel-level segmentation. Specifically, we employ LoRA to fine-tune a vision-language encoder, generating location-aware, class-specific text prompts that guide the Segment Anything Model (SAM) to produce generic segmentation masks. Object localization is handled by a dedicated detection module, while semantic decoding is delegated to a decoupled segmentation module, enabling cross-task knowledge sharing. This design effectively mitigates inter-class interference, achieving state-of-the-art performance across multiple CSS benchmarks. Our method simultaneously ensures high retention of previously learned classes and strong scalability to emerging categories, establishing a novel paradigm for continual learning in dense prediction tasks.

๐Ÿ“ Abstract
Continual Semantic Segmentation (CSS) requires learning new classes without forgetting previously acquired knowledge, addressing the fundamental challenge of catastrophic forgetting in dense prediction tasks. However, existing CSS methods typically employ single-stage encoder-decoder architectures where segmentation masks and class labels are tightly coupled, leading to interference between old and new class learning and suboptimal retention-plasticity balance. We introduce DecoupleCSS, a novel two-stage framework for CSS. By decoupling class-aware detection from class-agnostic segmentation, DecoupleCSS enables more effective continual learning, preserving past knowledge while learning new classes. The first stage leverages pre-trained text and image encoders, adapted using LoRA, to encode class-specific information and generate location-aware prompts. In the second stage, the Segment Anything Model (SAM) is employed to produce precise segmentation masks, ensuring that segmentation knowledge is shared across both new and previous classes. This approach improves the balance between retention and adaptability in CSS, achieving state-of-the-art performance across a variety of challenging tasks. Our code is publicly available at: https://github.com/euyis1019/Decoupling-Continual-Semantic-Segmentation.
Problem

Research questions and friction points this paper is trying to address.

Addresses catastrophic forgetting in continual semantic segmentation
Decouples class detection and segmentation to reduce interference
Improves retention-plasticity balance in learning new and old classes
Innovation

Methods, ideas, or system contributions that make the work stand out.

Two-stage framework decouples detection and segmentation
Uses LoRA-adapted text and image encoders
Employs SAM for precise class-agnostic segmentation
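The decoupling idea in the bullets above can be sketched in code. This is a toy illustration, not the paper's implementation: the per-class "adapters" stand in for LoRA-adapted, class-specific prompt generators (Stage 1), and `class_agnostic_segment` stands in for SAM's promptable mask decoder (Stage 2). The point it demonstrates is structural: adding a new class only adds a Stage 1 adapter, while the shared Stage 2 segmenter is untouched, so old-class masks cannot be disturbed.

```python
# Hypothetical sketch of the two-stage decoupled CSS pipeline.
# Stage 1 (class-aware): one lightweight adapter per learned class emits
# location prompts (boxes). Stage 2 (class-agnostic): a shared, frozen
# segmenter turns any prompt into a mask. All components are toy stand-ins.
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1) location prompt

@dataclass
class ClassAwareDetector:
    """Stage 1: holds one adapter (LoRA-like stand-in) per learned class."""
    adapters: Dict[str, Callable] = field(default_factory=dict)

    def learn_class(self, name: str, adapter: Callable) -> None:
        # Continual step: a new adapter is added; old ones are never modified.
        self.adapters[name] = adapter

    def detect(self, image) -> Dict[str, List[Box]]:
        return {cls: fn(image) for cls, fn in self.adapters.items()}

def class_agnostic_segment(image, box: Box) -> List[Tuple[int, int]]:
    """Stage 2: SAM-like promptable segmenter; here, every pixel in the box."""
    x0, y0, x1, y1 = box
    return [(x, y) for y in range(y0, y1) for x in range(x0, x1)]

def segment_scene(detector: ClassAwareDetector, image):
    """Full pipeline: class-aware prompts -> shared class-agnostic masks."""
    return {
        cls: [class_agnostic_segment(image, b) for b in boxes]
        for cls, boxes in detector.detect(image).items()
    }

detector = ClassAwareDetector()
detector.learn_class("cat", lambda img: [(0, 0, 2, 2)])
old = segment_scene(detector, image=None)
detector.learn_class("dog", lambda img: [(2, 2, 4, 4)])  # new class arrives
new = segment_scene(detector, image=None)
# Old-class masks are bit-identical after learning "dog": no interference.
assert new["cat"] == old["cat"]
```

Because Stage 2 is shared across all classes, segmentation knowledge accumulated on old classes transfers for free to new ones; only the small class-specific detection adapters grow with the task sequence.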
Yifu Guo
Sun Yat-sen University, South China Normal University
Yuquan Lu
Sun Yat-sen University, South China Normal University
Wentao Zhang
Institute of Physics, Chinese Academy of Sciences
Zishan Xu
Tsinghua University
Dexia Chen
Sun Yat-sen University
Siyu Zhang
4DV.ai
Yizhe Zhang
Nanjing University of Science and Technology
Ruixuan Wang
Sun Yat-sen University