🤖 AI Summary
This work addresses the lack of clinically grounded, standardized scenarios and comprehensive evaluation in existing continual learning research for medical image segmentation, which has disproportionately emphasized catastrophic forgetting while neglecting critical attributes such as plasticity. To bridge this gap, the study introduces three novel continual learning scenarios—Domain-CL, Class-CL, and Organ-CL—explicitly designed to reflect real-world clinical demands. It further establishes a multidimensional benchmark encompassing performance, forgetting, plasticity, forward transfer, parameter efficiency, and replay burden. Systematic experiments across multiple medical imaging datasets reveal that replay-based strategies achieve the best trade-off between stability and plasticity, whereas parameter isolation methods mitigate forgetting at the cost of increased model complexity. The study also highlights forward transfer capability as a significant and underexplored challenge in the field.
📝 Abstract
Continual learning (CL) is essential for deploying medical image segmentation models in clinical environments where imaging domains, anatomical targets, and diagnostic tasks evolve over time. However, continual segmentation still faces three main challenges. First, the scenarios for this task remain insufficiently standardized for real-world clinical settings. Second, existing research has focused primarily on mitigating forgetting, overlooking other essential properties such as plasticity. Third, a benchmark that comprehensively evaluates existing methods is still lacking. To address these gaps, we present a benchmark study of continual medical image segmentation. We first define three clinically motivated scenarios, namely Domain-CL, Class-CL, and Organ-CL, which respectively capture cross-center domain shift, incremental segmentation of anatomical structures, and cross-organ segmentation. We then introduce an evaluation framework that measures not only general performance and forgetting, but also plasticity, forward generalizability, parameter efficiency, and replay burden. Extensive experiments with representative CL methods show that it remains challenging to develop a model that satisfies all of these requirements simultaneously. Nevertheless, the results suggest that replay-based methods achieve the best overall balance between stability and plasticity, that parameter-isolation methods are effective at reducing forgetting, though at the cost of increased model size, and that forward generalizability remains a significantly understudied aspect of this research field. Finally, we discuss related learning paradigms and outline future directions for continual medical image segmentation.
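The abstract does not give formulas for its evaluation dimensions, so the following is only a minimal sketch of how four of them are commonly computed in the CL literature, not the benchmark's own definitions. It assumes a hypothetical score matrix `R`, where `R[i, j]` is the Dice score on task `j` after training through task `i`; the function name `cl_metrics` and the simplified forward score (performance on a task just before training on it, with no baseline subtracted) are illustrative assumptions. Parameter efficiency and replay burden are omitted because they depend on the model and replay buffer rather than on `R`.

```python
import numpy as np


def cl_metrics(R):
    """Summarize a T x T score matrix from a sequential training run.

    R[i, j] is assumed to hold the Dice score on task j measured after
    the model has been trained on tasks 0..i (hypothetical layout).
    """
    R = np.asarray(R, dtype=float)
    T = R.shape[0]
    if R.shape != (T, T) or T < 2:
        raise ValueError("R must be a square matrix with at least 2 tasks")

    # Overall performance: mean score over all tasks after the final stage.
    average = R[-1].mean()

    # Forgetting: for each earlier task, the drop from its best score at any
    # stage before the last one to its score after the last stage.
    forgetting = np.mean([R[j:T - 1, j].max() - R[-1, j] for j in range(T - 1)])

    # Plasticity: score on each task right after it was learned (diagonal).
    plasticity = np.diag(R).mean()

    # Forward score (simplified): score on each task just before training
    # on it, i.e. the entries directly above the diagonal.
    forward = np.mean([R[j - 1, j] for j in range(1, T)])

    return {
        "average": float(average),
        "forgetting": float(forgetting),
        "plasticity": float(plasticity),
        "forward": float(forward),
    }
```

For example, `cl_metrics([[0.8, 0.3, 0.2], [0.7, 0.85, 0.4], [0.6, 0.8, 0.9]])` describes a three-task run in which the first task drops from 0.8 to 0.6 by the end. Keeping the four quantities separate matches the abstract's point that a method can score well on forgetting while scoring poorly on plasticity or forward generalizability.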