Progressive Growing of Patch Size: Curriculum Learning for Accelerated and Improved Medical Image Segmentation

📅 2025-10-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address training inefficiency, sensitivity to class imbalance, and limited generalization caused by fixed patch sizes in 3D medical image segmentation, this paper proposes a progressive patch-size enlargement strategy based on automatic curriculum learning. The method dynamically adjusts input patch dimensions during training, without manual curriculum design, and is compatible with diverse architectures including UNet, UNETR, and SwinUNETR. Evaluated across 15 public benchmarks, it is run in two operational modes: in *resource mode*, training time falls to 44% of the baseline while performance is maintained; in *performance mode*, mean Dice improves by 1.28% at only 89% of the baseline training time, with the largest gains on long-tail tasks such as lesion segmentation. The core innovation is the first formulation of patch-size growth as an adaptive curriculum learning process, which accelerates convergence, mitigates class imbalance, significantly reduces result variance, and improves cross-task comparability.

📝 Abstract
In this work, we introduce Progressive Growing of Patch Size, an automatic curriculum learning approach for 3D medical image segmentation. Our approach progressively increases the patch size during model training, resulting in an improved class balance for smaller patch sizes and accelerated convergence of the training process. We evaluate our curriculum approach in two settings: a resource-efficient mode and a performance mode, both regarding Dice score performance and computational costs across 15 diverse and popular 3D medical image segmentation tasks. The resource-efficient mode matches the Dice score performance of the conventional constant patch size sampling baseline with a notable reduction in training time to only 44%. The performance mode improves upon constant patch size segmentation results, achieving a statistically significant relative mean performance gain of 1.28% in Dice Score. Remarkably, across all 15 tasks, our proposed performance mode manages to surpass the constant patch size baseline in Dice Score performance, while simultaneously reducing training time to only 89%. The benefits are particularly pronounced for highly imbalanced tasks such as lesion segmentation tasks. Rigorous experiments demonstrate that our performance mode not only improves mean segmentation performance but also reduces performance variance, yielding more trustworthy model comparison. Furthermore, our findings reveal that the proposed curriculum sampling is not tied to a specific architecture but represents a broadly applicable strategy that consistently boosts performance across diverse segmentation models, including UNet, UNETR, and SwinUNETR. In summary, we show that this simple yet elegant transformation on input data substantially improves both Dice Score performance and training runtime, while being compatible across diverse segmentation backbones.
Problem

Research questions and friction points this paper is trying to address.

Fixed patch sizes make 3D medical image segmentation training slow and computationally costly
Constant patch sampling is sensitive to class imbalance, hurting small-structure tasks such as lesion segmentation
High performance variance under fixed sampling limits trustworthy model comparison across tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Progressively increases patch size during training
Improves class balance and accelerates convergence
Applicable across diverse segmentation model architectures
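The progressive patch-size idea above can be sketched in a few lines. The paper derives its schedule automatically via curriculum learning; the sketch below instead assumes a simple linear growth from a small to a full patch size over training, with sizes snapped to a hardware-friendly multiple. The function names, the size ranges, and the linear schedule are illustrative assumptions, not the authors' implementation.

```python
import random

def patch_size_at(step, total_steps,
                  min_size=(32, 32, 32), max_size=(128, 128, 128),
                  multiple=16):
    """Illustrative linear schedule (an assumption; the paper learns the
    schedule automatically): interpolate the 3D patch size from min_size
    to max_size, rounded down to a multiple of `multiple` voxels."""
    frac = min(step / total_steps, 1.0)
    size = []
    for lo, hi in zip(min_size, max_size):
        s = lo + frac * (hi - lo)
        size.append(max(lo, int(s // multiple) * multiple))
    return tuple(size)

def sample_patch_origin(volume_shape, patch_size, rng=random):
    """Sample a random patch origin so the patch fits inside the volume."""
    return tuple(rng.randint(0, dim - p)
                 for dim, p in zip(volume_shape, patch_size))
```

Early in training the sampler returns small patches, which raises the fraction of foreground voxels per patch (better class balance) and lowers per-step cost; late in training it returns full-size patches so the model sees the same context as the constant-patch baseline.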