🤖 AI Summary
To address the challenges of large, irregularly shaped whole-slide images (WSIs) and the difficulty of modeling multi-granularity histopathological patterns for cancer survival prediction, this paper proposes a multi-scale cross-attention convolutional fusion framework. The method samples patches at multiple magnifications and introduces a cross-scale cross-attention mechanism that explicitly captures fine-grained interactions between cellular-level abnormalities and tissue-level architectural patterns. It further incorporates lightweight convolutional feature fusion and WSI-level Cox regression for survival risk modeling. Evaluated on six public cancer datasets, the framework significantly outperforms existing state-of-the-art methods, and performance improves further when pathology-specific backbone networks are integrated. The source code is publicly available.
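The cross-scale cross-attention described above can be illustrated with a minimal NumPy sketch: low-magnification (tissue-level) patch embeddings act as queries over high-magnification (cellular-level) embeddings. This is an illustrative sketch only, not the paper's implementation; all function and variable names here are our own assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_scale_attention(low_mag, high_mag, w_q, w_k, w_v):
    """Single-head cross-attention between magnification levels.

    low_mag:  (N_low, d)  tissue-level patch embeddings (queries)
    high_mag: (N_high, d) cellular-level patch embeddings (keys/values)
    Returns (N_low, d) fused features: each tissue-level token is a
    weighted mix of the cellular-level tokens it attends to.
    """
    q = low_mag @ w_q                          # (N_low, d)
    k = high_mag @ w_k                         # (N_high, d)
    v = high_mag @ w_v                         # (N_high, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])    # (N_low, N_high)
    attn = softmax(scores, axis=-1)            # rows sum to 1
    return attn @ v

rng = np.random.default_rng(0)
d = 16
low = rng.normal(size=(4, d))      # e.g. 4 low-magnification patches
high = rng.normal(size=(32, d))    # e.g. 32 high-magnification patches
w_q, w_k, w_v = (rng.normal(size=(d, d)) for _ in range(3))
fused = cross_scale_attention(low, high, w_q, w_k, w_v)
print(fused.shape)  # (4, 16)
```

In a full model this would be multi-headed and stacked with the convolutional fusion the summary mentions; the sketch only shows the scale-interaction step.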
📝 Abstract
Cancer survival prediction from whole slide images (WSIs) is a challenging task in computational pathology due to the large size, irregular shape, and high granularity of WSIs. These characteristics make it difficult to capture the full spectrum of patterns, from subtle cellular abnormalities to complex tissue interactions, which are crucial for accurate prognosis. To address this, we propose CrossFusion, a novel multi-scale feature integration framework that extracts and fuses information from patches across different magnification levels. By effectively modeling both scale-specific patterns and their interactions, CrossFusion generates a rich feature set that enhances survival prediction accuracy. We validate our approach across six cancer types from public datasets, demonstrating significant improvements over existing state-of-the-art methods. Moreover, when coupled with domain-specific feature extraction backbones, our method shows further gains in prognostic performance compared to general-purpose backbones. The source code is available at: https://github.com/RustinS/CrossFusion