🤖 AI Summary
This study addresses the limited generalizability of deep learning models trained on a single MRI sequence to other sequences, a key barrier to clinical deployment. To this end, the authors construct a multi-sequence pancreatic MRI benchmark dataset comprising 1,386 3D scans from eight centers and systematically evaluate cross-sequence generalization across T1-weighted, T2-weighted, and out-of-phase (opposed-phase) sequences. The work shows that inter-sequence domain shifts substantially exceed inter-center variations and constitute the primary bottleneck to generalization. It further demonstrates that existing domain generalization methods provide negligible benefit under physics-driven contrast inversions, whereas medical foundation models such as MedSAM2 achieve moderate zero-shot transfer performance by leveraging contrast-invariant shape priors. Semi-supervised learning proves effective only when intensity distributions remain stable, and becomes unstable on sequences with high intra-organ variability.
📝 Abstract
Automatic pancreas segmentation is fundamental to abdominal MRI analysis, yet deep learning models trained on one MRI sequence often fail catastrophically when applied to another, a challenge that has received little systematic investigation. We introduce CrossPan, a multi-institutional benchmark comprising 1,386 3D scans across three routinely acquired sequences (T1-weighted, T2-weighted, and Out-of-Phase) from eight centers. Our experiments reveal three key findings. First, cross-sequence domain shifts are far more severe than cross-center variability: models achieving Dice scores above 0.85 in-domain collapse to near-zero (<0.02) when transferred across sequences. Second, state-of-the-art domain generalization methods provide negligible benefit under these physics-driven contrast inversions, whereas foundation models like MedSAM2 maintain moderate zero-shot performance through contrast-invariant shape priors. Third, semi-supervised learning offers gains only under stable intensity distributions and becomes unstable on sequences with high intra-organ variability. These results establish cross-sequence generalization, rather than model architecture or center diversity, as the primary barrier to clinically deployable pancreas MRI segmentation. Dataset and code are available at https://crosspan.netlify.app/.
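For readers less familiar with the metric behind the reported numbers, the Dice score measures the overlap between a predicted mask and a reference mask, ranging from 0 (no overlap) to 1 (identical). The following is a minimal sketch, not the benchmark's evaluation code: the function name `dice_score`, the flat-list mask representation, and the convention of returning 1.0 for two empty masks are all assumptions made for illustration, and the toy masks are invented.

```python
from typing import Iterable


def dice_score(pred: Iterable[int], target: Iterable[int]) -> float:
    """Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks given as flat voxel sequences."""
    pred_mask = [bool(v) for v in pred]
    target_mask = [bool(v) for v in target]
    if len(pred_mask) != len(target_mask):
        raise ValueError("masks must contain the same number of voxels")

    # Count voxels labeled as pancreas in both masks.
    intersection = sum(p and t for p, t in zip(pred_mask, target_mask))
    total = sum(pred_mask) + sum(target_mask)
    if total == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    return 2.0 * intersection / total


# Toy flattened masks (hypothetical values, not taken from the dataset).
reference = [0, 1, 1, 1, 1, 0, 0, 0]
in_domain_pred = [0, 1, 1, 1, 0, 0, 0, 0]      # misses one pancreas voxel
cross_sequence_pred = [0, 0, 0, 0, 0, 0, 1, 1]  # predicts an unrelated region

print(dice_score(reference, in_domain_pred))
print(dice_score(reference, cross_sequence_pred))
```

In this toy case the first prediction overlaps the reference on three of its four foreground voxels, while the second does not overlap at all, which mirrors the kind of gap between in-domain and cross-sequence scores that the abstract describes.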