Decoupling Continual Semantic Segmentation

📅 2025-08-07
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Continual Semantic Segmentation (CSS) suffers from an inherent trade-off between catastrophic forgetting and learning novel classes. To address this, we propose the first decoupled continual learning framework that separately models class-aware object detection and class-agnostic pixel-level segmentation. Specifically, we employ LoRA to fine-tune a vision-language encoder, generating location-aware, class-specific text prompts that guide the Segment Anything Model (SAM) to produce generic segmentation masks. Object localization is handled by a dedicated detection module, while semantic decoding is delegated to a decoupled segmentation module, enabling cross-task knowledge sharing. This design effectively mitigates inter-class interference, achieving state-of-the-art performance across multiple CSS benchmarks. Our method simultaneously ensures high retention of previously learned classes and strong scalability to emerging categories, establishing a novel paradigm for continual learning in dense prediction tasks.

๐Ÿ“ Abstract
Continual Semantic Segmentation (CSS) requires learning new classes without forgetting previously acquired knowledge, addressing the fundamental challenge of catastrophic forgetting in dense prediction tasks. However, existing CSS methods typically employ single-stage encoder-decoder architectures where segmentation masks and class labels are tightly coupled, leading to interference between old and new class learning and suboptimal retention-plasticity balance. We introduce DecoupleCSS, a novel two-stage framework for CSS. By decoupling class-aware detection from class-agnostic segmentation, DecoupleCSS enables more effective continual learning, preserving past knowledge while learning new classes. The first stage leverages pre-trained text and image encoders, adapted using LoRA, to encode class-specific information and generate location-aware prompts. In the second stage, the Segment Anything Model (SAM) is employed to produce precise segmentation masks, ensuring that segmentation knowledge is shared across both new and previous classes. This approach improves the balance between retention and adaptability in CSS, achieving state-of-the-art performance across a variety of challenging tasks. Our code is publicly available at: https://github.com/euyis1019/Decoupling-Continual-Semantic-Segmentation.
Problem

Research questions and friction points this paper is trying to address.

Addresses catastrophic forgetting in continual semantic segmentation
Decouples class detection and segmentation to reduce interference
Improves retention-plasticity balance in learning new and old classes
Innovation

Methods, ideas, or system contributions that make the work stand out.

Two-stage framework decouples detection and segmentation
Uses LoRA-adapted text and image encoders
Employs SAM for precise class-agnostic segmentation
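The decoupling idea in the bullets above can be sketched in code. This is a toy illustration, not the paper's implementation: the per-class "adapters" stand in for LoRA-adapted, class-specific prompt generators (Stage 1), and `class_agnostic_segment` stands in for SAM's promptable mask decoder (Stage 2). The point it demonstrates is structural: adding a new class only adds a Stage 1 adapter, while the shared Stage 2 segmenter is untouched, so old-class masks cannot be disturbed.

```python
# Hypothetical sketch of the two-stage decoupled CSS pipeline.
# Stage 1 (class-aware): one lightweight adapter per learned class emits
# location prompts (boxes). Stage 2 (class-agnostic): a shared, frozen
# segmenter turns any prompt into a mask. All components are toy stand-ins.
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1) location prompt

@dataclass
class ClassAwareDetector:
    """Stage 1: holds one adapter (LoRA-like stand-in) per learned class."""
    adapters: Dict[str, Callable] = field(default_factory=dict)

    def learn_class(self, name: str, adapter: Callable) -> None:
        # Continual step: a new adapter is added; old ones are never modified.
        self.adapters[name] = adapter

    def detect(self, image) -> Dict[str, List[Box]]:
        return {cls: fn(image) for cls, fn in self.adapters.items()}

def class_agnostic_segment(image, box: Box) -> List[Tuple[int, int]]:
    """Stage 2: SAM-like promptable segmenter; here, every pixel in the box."""
    x0, y0, x1, y1 = box
    return [(x, y) for y in range(y0, y1) for x in range(x0, x1)]

def segment_scene(detector: ClassAwareDetector, image):
    """Full pipeline: class-aware prompts -> shared class-agnostic masks."""
    return {
        cls: [class_agnostic_segment(image, b) for b in boxes]
        for cls, boxes in detector.detect(image).items()
    }

detector = ClassAwareDetector()
detector.learn_class("cat", lambda img: [(0, 0, 2, 2)])
old = segment_scene(detector, image=None)
detector.learn_class("dog", lambda img: [(2, 2, 4, 4)])  # new class arrives
new = segment_scene(detector, image=None)
# Old-class masks are bit-identical after learning "dog": no interference.
assert new["cat"] == old["cat"]
```

Because Stage 2 is shared across all classes, segmentation knowledge accumulated on old classes transfers for free to new ones; only the small class-specific detection adapters grow with the task sequence.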
Yifu Guo
Sun Yat-sen University, South China Normal University
Yuquan Lu
Sun Yat-sen University, South China Normal University
Wentao Zhang
Institute of Physics, Chinese Academy of Sciences
Zishan Xu
Tsinghua University
Dexia Chen
Sun Yat-sen University
Siyu Zhang
4DV.ai
Yizhe Zhang
Nanjing University of Science and Technology
Ruixuan Wang
Sun Yat-sen University