🤖 AI Summary
This study addresses the challenge of generalizing automatic segmentation of the gross tumor volume (GTV) and lymph node clinical target volume (LN CTV) in nasopharyngeal carcinoma radiotherapy across multi-center, multi-modal CT imaging. It establishes the first multi-center benchmark encompassing both non-contrast and contrast-enhanced CT scans and introduces a clinically realistic mixed-data setting involving single- and dual-modality inputs. Within a unified deep learning framework, the robustness of multiple participating models is systematically evaluated under cross-center and cross-modality conditions. The best GTV segmentation models reach average Dice similarity coefficients (DSC) of 74.61% and 56.79% on the internal and external test sets, respectively, while the best LN CTV models reach average DSCs of 60.24%, 60.50%, and 57.23% on the paired CT, contrast-enhanced-only, and non-contrast-only subsets, respectively. These results provide the first comprehensive insight into the generalization limitations of current methods in complex clinical settings.
📝 Abstract
Accurate delineation of the Gross Tumor Volume (GTV), Lymph Node Clinical Target Volume (LN CTV), and Organs-at-Risk (OARs) from Computed Tomography (CT) scans is essential for precise radiotherapy planning in Nasopharyngeal Carcinoma (NPC). Building upon SegRap2023, which focused on OAR and GTV segmentation using single-center paired non-contrast CT (ncCT) and contrast-enhanced CT (ceCT) scans, the SegRap2025 challenge aims to enhance the generalizability and robustness of segmentation models across imaging centers and modalities. SegRap2025 comprises two tasks. Task01 addresses GTV segmentation using paired CT from the SegRap2023 dataset, with an additional external testing set to evaluate cross-center generalization. Task02 focuses on LN CTV segmentation using multi-center training data and an unseen external testing set, where each case contains either paired CT scans or a single modality, emphasizing both cross-center and cross-modality robustness. This paper presents the challenge setup and provides a comprehensive analysis of the solutions submitted by ten participating teams. For the GTV segmentation task, the top-performing models achieved average Dice Similarity Coefficients (DSC) of 74.61% and 56.79% on the internal and external testing cohorts, respectively. For the LN CTV segmentation task, the highest average DSC values reached 60.24%, 60.50%, and 57.23% on the paired CT, ceCT-only, and ncCT-only subsets, respectively. SegRap2025 establishes a large-scale multi-center, multi-modality benchmark for evaluating generalization and robustness in radiotherapy target segmentation, providing valuable insights toward clinically applicable automated radiotherapy planning systems. The benchmark is available at: https://hilab-git.github.io/SegRap2025_Challenge.
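For readers unfamiliar with the metric reported above, the following is a minimal sketch of the standard Dice Similarity Coefficient, DSC = 2|A ∩ B| / (|A| + |B|), expressed in percent. It is not the challenge's evaluation code: the function name `dice_percent`, the flattened-mask input, and the convention of scoring two empty masks as 100% are assumptions made for illustration.

```python
from typing import Iterable


def dice_percent(pred: Iterable[int], ref: Iterable[int]) -> float:
    """Dice similarity coefficient (in percent) for two flattened binary masks."""
    pred_mask = [bool(v) for v in pred]
    ref_mask = [bool(v) for v in ref]
    if len(pred_mask) != len(ref_mask):
        raise ValueError("masks must contain the same number of voxels")

    # Voxels labelled foreground in both the prediction and the reference.
    overlap = sum(p and r for p, r in zip(pred_mask, ref_mask))
    total = sum(pred_mask) + sum(ref_mask)

    # Assumption: two empty masks are treated as perfect agreement.
    if total == 0:
        return 100.0
    return 100.0 * 2 * overlap / total


# Toy example: 3 predicted voxels, 3 reference voxels, 2 shared.
prediction = [1, 1, 1, 0, 0, 0]
reference = [0, 1, 1, 1, 0, 0]
score = dice_percent(prediction, reference)
```

In the toy example the overlap is 2 voxels out of 3 + 3, so the score is 2·2/6, about 66.67%. An "average DSC" such as those quoted in the abstract would then be the mean of per-case scores over a testing cohort.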