Enhancing WSI-Based Survival Analysis with Report-Auxiliary Self-Distillation

📅 2025-09-19
🤖 AI Summary
Whole-slide image (WSI)-based survival analysis faces three challenges: noisy features, scarce labeled data, and underuse of the patient-specific information in pathology reports. To address these, the paper proposes Rasa, a report-auxiliary self-distillation framework. Rasa uses large language models to extract fine-grained, WSI-relevant descriptions from noisy pathology text; applies a text-guided self-distillation mechanism to suppress irrelevant WSI features; and introduces a risk-aware mix-up strategy to increase the quantity and diversity of the training data. Evaluated on a collected colorectal cancer (CRC) dataset and the public TCGA-BRCA cohort, Rasa is reported to outperform state-of-the-art methods for cancer prognosis prediction.

📝 Abstract
Survival analysis based on Whole Slide Images (WSIs) is crucial for evaluating cancer prognosis, as WSIs offer detailed microscopic information essential for predicting patient outcomes. However, traditional WSI-based survival analysis usually faces noisy features and limited data accessibility, which hinder its ability to capture critical prognostic features effectively. Although pathology reports provide rich patient-specific information that could assist analysis, their potential to enhance WSI-based survival analysis remains largely unexplored. To this end, this paper proposes a novel Report-auxiliary self-distillation (Rasa) framework for WSI-based survival analysis. First, advanced large language models (LLMs) are utilized to extract fine-grained, WSI-relevant textual descriptions from original noisy pathology reports via a carefully designed task prompt. Next, a self-distillation-based pipeline is designed to filter out irrelevant or redundant WSI features for the student model under the guidance of the teacher model's textual knowledge. Finally, a risk-aware mix-up strategy is incorporated during the training of the student model to enhance both the quantity and diversity of the training data. Extensive experiments carried out on our collected data (CRC) and public data (TCGA-BRCA) demonstrate the superior effectiveness of Rasa against state-of-the-art methods. Our code is available at https://github.com/zhengwang9/Rasa.
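The abstract does not specify how text guidance filters WSI features, so the following is only a minimal sketch of one plausible reading: score each patch feature by cosine similarity to a report embedding and keep the top-scoring patches. The function name `text_guided_topk`, the use of cosine similarity, and the top-k rule are all assumptions made for illustration, not the authors' method.

```python
import numpy as np

def text_guided_topk(patch_feats, text_emb, k):
    """Hypothetical filter: keep the k patches most similar to a text embedding."""
    patch_feats = np.asarray(patch_feats, dtype=float)
    text_emb = np.asarray(text_emb, dtype=float)
    # Cosine similarity between every patch feature and the text embedding.
    p = patch_feats / (np.linalg.norm(patch_feats, axis=1, keepdims=True) + 1e-8)
    t = text_emb / (np.linalg.norm(text_emb) + 1e-8)
    scores = p @ t
    # Indices of the k highest-scoring patches, in descending score order.
    keep = np.argsort(-scores)[:k]
    return patch_feats[keep], keep
```

In a self-distillation setup, a selection like this would presumably be driven by the teacher branch that sees the text, with the student trained to reproduce it from WSI features alone; the released code at the URL above is the authoritative reference.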
Problem

Research questions and friction points this paper is trying to address.

Enhancing cancer prognosis via whole slide image survival analysis
Overcoming noisy features and limited data in WSI analysis
Leveraging pathology reports to improve prognostic feature extraction
Innovation

Methods, ideas, or system contributions that make the work stand out.

Report-auxiliary self-distillation framework
LLM-extracted WSI-relevant textual descriptions
Risk-aware mix-up strategy to augment training data
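The page gives no formula for the risk-aware mix-up, so the sketch below shows only one way such a strategy could look: each sample is interpolated with a partner drawn from the same risk bin. Binning by survival-time quantiles, restricting partners to the same event indicator, the Beta-distributed mixing weight, and the name `risk_aware_mixup` are all assumptions for illustration.

```python
import numpy as np

def risk_aware_mixup(features, times, events, n_bins=4, alpha=0.4, seed=0):
    """Hypothetical mix-up: interpolate each sample with a same-risk-bin partner."""
    rng = np.random.default_rng(seed)
    features = np.asarray(features, dtype=float)
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)

    # Assumption: survival time is the risk proxy; bin samples by time quantiles.
    edges = np.quantile(times, np.linspace(0, 1, n_bins + 1)[1:-1])
    bins = np.digitize(times, edges)

    mixed_x, mixed_t = [], []
    for i in range(len(times)):
        # Candidate partners share both the risk bin and the event indicator.
        same = np.flatnonzero((bins == bins[i]) & (events == events[i]))
        j = rng.choice(same)  # may be i itself, which leaves the sample unchanged
        lam = rng.beta(alpha, alpha)
        mixed_x.append(lam * features[i] + (1 - lam) * features[j])
        mixed_t.append(lam * times[i] + (1 - lam) * times[j])
    # Event indicators carry over unchanged because partners share them.
    return np.stack(mixed_x), np.array(mixed_t), events.copy()
```

Mixing only within a bin keeps each synthetic sample's survival time inside the range of its bin, which is one simple way to avoid blending patients with very different risk.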
Zheng Wang
School of Informatics, Xiamen University, Xiamen, China
Hong Liu
School of Informatics, Xiamen University, Xiamen, China
Zheng Wang
School of Informatics, Xiamen University, Xiamen, China; Shanghai Innovation Institution, Shanghai, China
Danyi Li
Nanfang Hospital, Southern Medical University, Guangzhou, China
Min Cen
University of Science and Technology of China
Baptiste Magnier
EuroMov Digital Health in Motion, Univ Montpellier, IMT Mines Ales, Ales, France; Service de Médecine Nucléaire, Centre Hospitalier Universitaire de Nîmes, Université de Montpellier, Nîmes, France
Li Liang
The University of Western Australia
Liansheng Wang
School of Informatics, Xiamen University, Xiamen, China