🤖 AI Summary
To address the challenge of sequentially emerging downstream tasks in medical image segmentation, where balancing knowledge retention with task adaptation remains difficult, this paper proposes a sequential progressive fine-tuning framework. Methodologically, it integrates sample selection based on maximum data similarity, LoRA-based low-rank adaptation, and feature-level knowledge distillation, complemented by data distribution alignment and loss landscape analysis to assess training stability. Unlike parallel fine-tuning (which isolates tasks) or multi-task fine-tuning (which requires simultaneous access to all datasets), the framework integrates tasks incrementally using only current-task data. Experiments on ten 3D medical segmentation benchmarks show an average Dice score improvement of 3.0%, significantly outperforming state-of-the-art continual learning methods. The framework also exhibits superior cross-task generalization on unseen tasks, validating its ability to preserve prior knowledge while adapting to new tasks.
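The LoRA-plus-distillation combination described above can be sketched in a few lines. The following is a hypothetical NumPy illustration, not the paper's implementation: `lora_adapt` shows the generic LoRA update (frozen weight `W` plus a trainable low-rank term `(alpha/rank) * B @ A`), and `feature_kd_loss` is a generic feature-level distillation term (mean squared distance between fine-tuned and frozen pre-trained features). All names and the specific loss form are assumptions for illustration.

```python
import numpy as np

def lora_adapt(W, rank=4, alpha=8, seed=0):
    """LoRA sketch (hypothetical): freeze W, learn only A (rank x d_in)
    and B (d_out x rank). The adapted weight is W + (alpha/rank) * B @ A."""
    rng = np.random.default_rng(seed)
    d_out, d_in = W.shape
    A = 0.01 * rng.standard_normal((rank, d_in))  # small random init
    B = np.zeros((d_out, rank))                   # zero init: adaptation starts at W
    return W + (alpha / rank) * (B @ A), A, B

def feature_kd_loss(student_feats, teacher_feats):
    """Generic feature-level distillation term (assumed form): penalize drift
    of fine-tuned (student) features from frozen pre-trained (teacher) ones."""
    return float(np.mean((student_feats - teacher_feats) ** 2))
```

Because `B` is zero-initialized, the adapted weight equals `W` before any training, so fine-tuning starts exactly at the pre-trained model; the distillation term then discourages intermediate features from drifting as `A` and `B` are updated.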
📝 Abstract
Foundation models have become a promising paradigm for advancing medical image analysis, particularly for segmentation tasks where downstream applications often emerge sequentially. Existing fine-tuning strategies, however, remain limited: parallel fine-tuning isolates tasks and fails to exploit shared knowledge, while multi-task fine-tuning requires simultaneous access to all datasets and struggles with incremental task integration. To address these challenges, we propose MedSeqFT, a sequential fine-tuning framework that progressively adapts pre-trained models to new tasks while refining their representational capacity. MedSeqFT introduces two core components: (1) Maximum Data Similarity (MDS) selection, which identifies downstream samples most representative of the original pre-training distribution to preserve general knowledge, and (2) Knowledge and Generalization Retention Fine-Tuning (K&G RFT), a LoRA-based knowledge distillation scheme that balances task-specific adaptation with the retention of pre-trained knowledge. Extensive experiments on two multi-task datasets covering ten 3D segmentation tasks demonstrate that MedSeqFT consistently outperforms state-of-the-art fine-tuning strategies, yielding substantial performance gains (e.g., an average Dice improvement of 3.0%). Furthermore, evaluations on two unseen tasks (COVID-19-20 and Kidney) verify that MedSeqFT enhances transferability, particularly for tumor segmentation. Visual analyses of loss landscapes and parameter variations further highlight the robustness of MedSeqFT. These results establish sequential fine-tuning as an effective, knowledge-retentive paradigm for adapting foundation models to evolving clinical tasks. Code will be released.
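As a rough illustration of the MDS idea, one simple instantiation is to score each downstream sample by the cosine similarity of its encoder features to the centroid of the pre-training feature distribution and keep the top-k most similar samples. This is a hedged sketch: the function name and the centroid-based criterion are assumptions for illustration, not the paper's exact definition of Maximum Data Similarity.

```python
import numpy as np

def mds_select(candidate_feats, pretrain_feats, k):
    """Simplified stand-in for MDS selection: rank downstream samples by the
    cosine similarity of their features to the centroid of the pre-training
    feature distribution, and return the indices of the top-k samples."""
    centroid = pretrain_feats.mean(axis=0)
    centroid = centroid / np.linalg.norm(centroid)
    normed = candidate_feats / np.linalg.norm(candidate_feats, axis=1, keepdims=True)
    sims = normed @ centroid      # cosine similarity per candidate
    return np.argsort(-sims)[:k]  # indices of the k most similar samples
```

Under this simplification, the selected subset is the one whose features look most like pre-training data, which is the property the framework exploits to anchor general knowledge while the remaining samples drive task-specific adaptation.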