Revisiting Data Scaling Law for Medical Segmentation

📅 2025-11-17
🤖 AI Summary
Scaling laws relating data volume to model performance remain underexplored in medical anatomical segmentation, particularly across multi-task and multi-modal settings; scaling up data is hindered by high annotation costs, and unrealistic deformations used for augmentation can distort anatomy. Method: We propose a registration-guided geodesic subspace modeling framework that generates topology-preserving, diffeomorphic deformations to enhance anatomical fidelity in synthetic data augmentation. Contribution/Results: Across 15 semantic segmentation tasks and 4 imaging modalities, we empirically observe power-law scaling of segmentation performance with dataset size. Our scalable augmentation strategy improves data efficiency, accelerating model convergence and surpassing standard power-law scaling trends without requiring additional annotated data. Experiments show consistent gains in multi-task, multi-modal segmentation while reducing both annotation burden and computational overhead.

📝 Abstract
The population loss of trained deep neural networks often exhibits power law scaling with the size of the training dataset, guiding significant performance advancements in deep learning applications. In this study, we focus on the scaling relationship with data size in the context of medical anatomical segmentation, a domain that remains underexplored. We analyze scaling laws for anatomical segmentation across 15 semantic tasks and 4 imaging modalities, demonstrating that larger datasets significantly improve segmentation performance, following similar scaling trends. Motivated by the topological isomorphism in images sharing anatomical structures, we evaluate the impact of deformation-guided augmentation strategies on data scaling laws, specifically random elastic deformation and registration-guided deformation. We also propose a novel, scalable image augmentation approach that generates diffeomorphic mappings from geodesic subspace based on image registration to introduce realistic deformation. Our experimental results demonstrate that both registered and generated deformation-based augmentation considerably enhance data utilization efficiency. The proposed generated deformation method notably achieves superior performance and accelerated convergence, surpassing standard power law scaling trends without requiring additional data. Overall, this work provides insights into the understanding of segmentation scalability and topological variation impact in medical imaging, thereby leading to more efficient model development with reduced annotation and computational costs.
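The power-law relationship mentioned in the abstract can be illustrated with a small fitting sketch. It assumes the segmentation error (for example 1 - Dice, which is an assumption here, not a definition taken from the paper) follows `error ≈ a * n**(-b)` for training-set size `n`, and fits `a` and `b` by least squares in log-log space. The function name `fit_power_law` and all numbers are hypothetical and synthetic; they are not results from the paper.

```python
import math


def fit_power_law(sizes, errors):
    """Fit errors ≈ a * sizes**(-b) by linear least squares in log-log space."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(e) for e in errors]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # log(error) = log(a) - b * log(n), so a = exp(intercept) and b = -slope.
    return math.exp(intercept), -slope


# Synthetic, noise-free data generated from a known power law (hypothetical values).
sizes = [50, 100, 200, 400, 800]
errors = [0.8 * n ** -0.35 for n in sizes]

a, b = fit_power_law(sizes, errors)

# Extrapolate the fitted curve to a larger, unseen dataset size.
predicted_error = a * 1600 ** -b
```

On real measurements the points would be noisy, and a fit with an additive irreducible-error term may describe the curve better than a pure power law; that variant is not shown here.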
Problem

Research questions and friction points this paper is trying to address.

Investigating data scaling laws for medical anatomical segmentation across multiple tasks
Evaluating deformation-based augmentation strategies to improve data utilization efficiency
Proposing a novel image augmentation method to enhance performance without additional data
Innovation

Methods, ideas, or system contributions that make the work stand out.

Proposed a scalable augmentation method based on diffeomorphic mappings
Evaluated the impact of deformation-guided augmentation strategies on data scaling laws
Enhanced data utilization efficiency without additional data
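To make the term "diffeomorphic" concrete, here is a one-dimensional toy sketch. It is not the paper's registration-guided geodesic subspace method; it swaps in scaling and squaring of a stationary velocity field, a common way to build a smooth, invertible map. The helper names (`interp`, `exp_velocity`), the sine-shaped velocity, and the grid size are all hypothetical choices for illustration. In 1D, a map that keeps the endpoints fixed and is strictly increasing does not fold, which is the property checked at the end.

```python
import math


def interp(values, x):
    """Linearly interpolate samples on a uniform grid over [0, 1] at position x."""
    n = len(values)
    pos = min(max(x, 0.0), 1.0) * (n - 1)  # clamp to the grid, then scale to index space
    i = min(int(pos), n - 2)
    t = pos - i
    return values[i] * (1.0 - t) + values[i + 1] * t


def exp_velocity(velocity, steps=6):
    """Integrate a stationary velocity field over unit time by scaling and squaring."""
    n = len(velocity)
    grid = [i / (n - 1) for i in range(n)]
    # Scaling: start from a displacement of v / 2**steps, small enough to be invertible.
    phi = [x + v / 2 ** steps for x, v in zip(grid, velocity)]
    # Squaring: each pass composes the current map with itself, doubling the flow time.
    for _ in range(steps):
        phi = [interp(phi, p) for p in phi]
    return phi


n = 101
# Hypothetical velocity field that vanishes at both ends, so the endpoints stay fixed.
velocity = [0.2 * math.sin(math.pi * i / (n - 1)) for i in range(n)]
phi = exp_velocity(velocity)

# A strictly increasing map has no folding, the 1D analogue of preserving topology.
is_monotone = all(b > a for a, b in zip(phi, phi[1:]))
```

By contrast, adding a large random displacement directly to the grid can produce a non-monotone map, which corresponds to folded, anatomically implausible deformations in images.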
Yuetan Chu
Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
Zhongyi Han
Professor, Shandong University
Machine Learning, Agentic AI, AI for Science
Gongning Luo
Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
Xin Gao
Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia