AI Summary
Continual Semantic Segmentation (CSS) suffers from an inherent trade-off between catastrophic forgetting and learning novel classes. To address this, we propose the first decoupled continual learning framework that separately models class-aware object detection and class-agnostic pixel-level segmentation. Specifically, we employ LoRA to fine-tune a vision-language encoder, generating location-aware, class-specific text prompts that guide the Segment Anything Model (SAM) to produce generic segmentation masks. Object localization is handled by a dedicated detection module, while semantic decoding is delegated to a decoupled segmentation module, enabling cross-task knowledge sharing. This design effectively mitigates inter-class interference, achieving state-of-the-art performance across multiple CSS benchmarks. Our method simultaneously ensures high retention of previously learned classes and strong scalability to emerging categories, establishing a novel paradigm for continual learning in dense prediction tasks.
Abstract
Continual Semantic Segmentation (CSS) requires learning new classes without forgetting previously acquired knowledge, addressing the fundamental challenge of catastrophic forgetting in dense prediction tasks. However, existing CSS methods typically employ single-stage encoder-decoder architectures where segmentation masks and class labels are tightly coupled, leading to interference between old and new class learning and suboptimal retention-plasticity balance. We introduce DecoupleCSS, a novel two-stage framework for CSS. By decoupling class-aware detection from class-agnostic segmentation, DecoupleCSS enables more effective continual learning, preserving past knowledge while learning new classes. The first stage leverages pre-trained text and image encoders, adapted using LoRA, to encode class-specific information and generate location-aware prompts. In the second stage, the Segment Anything Model (SAM) is employed to produce precise segmentation masks, ensuring that segmentation knowledge is shared across both new and previous classes. This approach improves the balance between retention and adaptability in CSS, achieving state-of-the-art performance across a variety of challenging tasks. Our code is publicly available at: https://github.com/euyis1019/Decoupling-Continual-Semantic-Segmentation.
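The two-stage decoupling described above can be illustrated with a toy sketch. This is not the DecoupleCSS implementation: every name here (`Prompt`, `detect_prompts`, `segment_box`, `semantic_map`) is hypothetical, the per-class detector functions merely stand in for the LoRA-adapted encoders, and a simple intensity threshold stands in for SAM. It only shows the structural idea that class labels come from stage one while the mask generator in stage two is shared and never changes when a class is added.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (row0, col0, row1, col1), end-exclusive
Image = List[List[float]]


@dataclass
class Prompt:
    class_id: int  # class-aware part: which class the prompt refers to
    box: Box       # location-aware part: where the object is


def detect_prompts(image: Image,
                   detectors: Dict[int, Callable[[Image], List[Box]]]) -> List[Prompt]:
    """Stage 1 (class-aware). Hypothetical stand-in: one detector per class.
    Learning a new class adds an entry and leaves the old entries untouched."""
    return [Prompt(cid, box) for cid, det in detectors.items() for box in det(image)]


def segment_box(image: Image, box: Box, threshold: float = 0.5) -> List[List[bool]]:
    """Stage 2 (class-agnostic). Toy stand-in for SAM: foreground is every
    pixel inside the box whose intensity exceeds the threshold."""
    r0, c0, r1, c1 = box
    return [[r0 <= r < r1 and c0 <= c < c1 and value > threshold
             for c, value in enumerate(row)]
            for r, row in enumerate(image)]


def semantic_map(image: Image,
                 detectors: Dict[int, Callable[[Image], List[Box]]],
                 background: int = 0) -> List[List[int]]:
    """Attach the class id from stage 1 to the class-agnostic mask from stage 2."""
    out = [[background] * len(row) for row in image]
    for prompt in detect_prompts(image, detectors):
        mask = segment_box(image, prompt.box)
        for r, mask_row in enumerate(mask):
            for c, inside in enumerate(mask_row):
                if inside:
                    out[r][c] = prompt.class_id
    return out


# Toy 4x4 image with two bright objects.
image = [
    [0.9, 0.9, 0.0, 0.0],
    [0.9, 0.1, 0.0, 0.0],
    [0.0, 0.0, 0.8, 0.8],
    [0.0, 0.0, 0.8, 0.8],
]

# Step 1 of the continual setting: only class 1 is known.
detectors = {1: lambda img: [(0, 0, 2, 2)]}
step1 = semantic_map(image, detectors)

# Step 2: class 2 is added; the class-1 detector and the segmenter are unchanged.
detectors[2] = lambda img: [(2, 2, 4, 4)]
step2 = semantic_map(image, detectors)
```

In this sketch, every pixel labeled as class 1 in `step1` keeps that label in `step2`, because adding class 2 touches neither the class-1 detector nor the shared segmenter. A real system would replace the stand-ins with learned components, where this isolation is only approximate.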