Tumor-aware augmentation with task-guided attention analysis improves rectal cancer segmentation from magnetic resonance images

📅 2026-05-06
🤖 AI Summary
This study examines why CT-pretrained Transformers degrade when transferred to MRI segmentation. The authors identify two interacting failure modes: inefficient token usage, caused by zero-padding MRI inputs to match the pretrained input geometry, and ineffective adaptation of pretrained features. To analyze them, they introduce the Attention Dilution Index (ADI), an entropy-based metric quantifying attention diverted toward uninformative padding tokens, and use centered kernel alignment (CKA) to measure feature reuse. To mitigate them, they propose two interventions: tumor-aware data augmentation and anisotropic cropping. Fine-tuned on identical rectal MRI datasets, the approach improves tumor detection rates to 224/247 (90.7%) with SMIT and 219/247 (88.7%) with Swin UNETR.
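The token cost of zero-padding can be illustrated with simple arithmetic. The sketch below is not from the paper: the function name `padding_token_fraction`, the volume shapes, and the patch size are all hypothetical, and it assumes the image is padded at one end of each axis so that it stays aligned with the patch grid. It only counts how many patch tokens would contain nothing but padding; it is not the ADI.

```python
import math


def padding_token_fraction(image_shape, model_shape, patch_size):
    """Fraction of patch tokens that hold only zero-padding.

    Assumes (hypothetically) that the image sits at the origin of the padded
    volume, so a token is informative if it overlaps the image at all.
    """
    total_tokens = 1
    informative_tokens = 1
    for img, model, patch in zip(image_shape, model_shape, patch_size):
        if img > model:
            raise ValueError("image is larger than the model input; crop first")
        if model % patch != 0:
            raise ValueError("model input must be divisible by the patch size")
        total_tokens *= model // patch
        # Tokens along this axis that overlap at least one real voxel.
        informative_tokens *= math.ceil(img / patch)
    return 1.0 - informative_tokens / total_tokens


# Hypothetical example: a thin 32-slice volume padded to a cubic 128^3 input.
fraction = padding_token_fraction((32, 128, 128), (128, 128, 128), (2, 2, 2))
print(f"{fraction:.2%}")  # prints 75.00%
```

Under these assumed shapes, three quarters of the tokens carry no image content, which is the kind of inefficiency an anisotropic crop matched to the data's slice count would avoid.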
📝 Abstract
Pretraining on large-scale datasets has been shown to improve transformer generalizability, even for out-of-domain (OOD) modalities and tasks. However, two common assumptions often fail under OOD transfer: that downstream datasets can be adapted to the fixed input geometry of pretrained models and that pretrained representations transfer effectively across imaging modalities. We show that these assumptions break down through two interacting failure modes in CT-to-MRI transfer: inefficient token usage caused by zero-padding to match pretrained input dimensions and ineffective feature adaptation. These failures led to accuracy degradation despite extensive fine-tuning. We investigated these failure modes using two CT-pretrained hierarchical shifted-window transformer backbones, SMIT and Swin UNETR, pretrained with different objectives and datasets. Our mechanistic analysis introduced an attention dilution index (ADI), an entropy-based metric quantifying attention diverted toward uninformative padding tokens, and used centered kernel alignment (CKA) to measure feature reuse in MRI tasks. ADI increased with zero-padding, while high feature reuse did not necessarily correspond to improved accuracy. To mitigate these issues, we introduced two interventions: a tumor-aware augmentation strategy to broaden coverage of tumor appearance heterogeneity and an anisotropic cropping strategy to restore token efficiency. Fine-tuning on identical rectal MRI datasets improved detection rates to 224/247 (90.7%) for SMIT and 219/247 (88.7%) for Swin UNETR, demonstrating improved robustness under CT-to-MRI transfer. This study is among the first to examine when pretrained transformers fail to transfer effectively across imaging modalities and how simple mitigation strategies, motivated by mechanistic analysis of datasets, can reduce transfer limitations while improving robustness and MRI detection.
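For readers unfamiliar with CKA, the following is a minimal sketch of the standard linear variant, computed between two feature matrices whose rows are the same examples. The abstract does not say which CKA variant, layers, or feature pooling the authors used, so the function name `linear_cka` and the random stand-in features are assumptions for illustration only.

```python
import numpy as np


def linear_cka(x, y):
    """Linear centered kernel alignment between two feature matrices.

    x: array of shape (n_examples, d1); y: array of shape (n_examples, d2).
    Returns a similarity in [0, 1]; 1 means the representations match up to
    an orthogonal transform and isotropic scaling.
    """
    # Center each feature dimension across examples.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return float(cross / (norm_x * norm_y))


# Random stand-ins for pretrained and fine-tuned layer activations.
rng = np.random.default_rng(0)
pretrained_features = rng.normal(size=(50, 8))
finetuned_features = rng.normal(size=(50, 5))
similarity = linear_cka(pretrained_features, finetuned_features)
```

A value near 1 between a layer before and after fine-tuning would indicate that the pretrained features were largely reused; as the abstract notes, such reuse did not necessarily correspond to improved accuracy.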
Problem

Research questions and friction points this paper is trying to address.

out-of-domain transfer
cross-modality generalization
rectal cancer segmentation
pretrained transformers
MRI
Innovation

Methods, ideas, or system contributions that make the work stand out.

tumor-aware augmentation
attention dilution index
anisotropic cropping
cross-modality transfer
mechanistic analysis
Aneesh Rangnekar
Research Fellow, Memorial Sloan Kettering Cancer Center
self-supervised learning, semi-supervised learning, active learning, medical imaging, remote sensing
Joao Miranda
Department of Radiology, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY, 10065, USA
Natally Horvat
Department of Radiology, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY, 10065, USA
Stephanie Chahwan
Department of Radiology, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY, 10065, USA
Samir Alrayess
Department of Radiology, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY, 10065, USA
Aditya Apte
Department of Medical Physics, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY, 10065, USA
Aditi Iyer
Department of Medical Physics, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY, 10065, USA
Eve LoCastro
Department of Medical Physics, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY, 10065, USA
Revathi Ravella
Department of Radiation Oncology, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY, 10065, USA
Marc J Gollub
Department of Radiology, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY, 10065, USA
Iva Petkovska
Department of Radiology, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY, 10065, USA
Jesse Joshua Smith
Department of Surgery, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY, 10065, USA
Paul Romesser
Department of Radiation Oncology, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY, 10065, USA
Julio Garcia-Aguilar
Department of Surgery, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY, 10065, USA
Harini Veeraraghavan
Associate Attending Computer Scientist, Memorial Sloan-Kettering Cancer Center
Multi-modality analysis, image segmentation, deep/machine learning, image registration, radiomics
Joseph Deasy
Department of Medical Physics, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY, 10065, USA