🤖 AI Summary
In continual semantic segmentation (CSS), conventional image replay suffers from partial annotations, causing ambiguity between unlabeled classes and background—exacerbating background shift and catastrophic forgetting. To address this, we propose Enhanced Instance Replay (EIR): (1) the first instance-level storage of old-class regions, precisely extracted via instance segmentation masks; (2) a cross-image instance embedding fusion mechanism that jointly aligns background distributions in both old and new images; and (3) a background-aware loss that explicitly disentangles background semantics from unlabeled classes. Evaluated on multiple CSS benchmarks, EIR achieves state-of-the-art performance, improving average incremental mIoU by 3.2% over prior methods. It substantially mitigates catastrophic forgetting and enhances long-term class consistency across tasks.
📝 Abstract
In this work, we focus on continual semantic segmentation (CSS), where segmentation networks are required to continuously learn new classes without erasing knowledge of previously learned ones. Although storing images of old classes and directly incorporating them into the training of new models has proven effective in mitigating catastrophic forgetting in classification tasks, this strategy presents notable limitations in CSS. Specifically, the stored and new images with partial category annotations leads to confusion between unannotated categories and the background, complicating model fitting. To tackle this issue, this paper proposes a novel Enhanced Instance Replay (EIR) method, which not only preserves knowledge of old classes while simultaneously eliminating background confusion by instance storage of old classes, but also mitigates background shifts in the new images by integrating stored instances with new images. By effectively resolving background shifts in both stored and new images, EIR alleviates catastrophic forgetting in the CSS task, thereby enhancing the model's capacity for CSS. Experimental results validate the efficacy of our approach, which significantly outperforms state-of-the-art CSS methods.