SMILE: a Scale-aware Multiple Instance Learning Method for Multicenter STAS Lung Cancer Histopathology Diagnosis

📅 2025-03-18
🤖 AI Summary
To address the time-consuming, subjective, and poorly reproducible manual assessment of Spread Through Air Spaces (STAS) in lung cancer histopathological diagnosis, this paper introduces the first multi-center, multi-cohort public STAS dataset (comprising the CSU, TCGA, and CPTAC cohorts) and proposes SMILE, a scale-aware multiple-instance learning framework. Its core innovation is a scale-adaptive attention mechanism that mitigates local overfitting caused by STAS lesion sparsity, heterogeneity, and multi-scale variability. The paper also establishes the first open-source STAS benchmark, encompassing 11 baseline methods. Experiments demonstrate that SMILE achieves an AUC surpassing the average diagnostic performance of pathologists on the CSU test set and accurately identifies STAS-positive cases in both CPTAC (n=251) and TCGA (n=319). All code and datasets are publicly released.

📝 Abstract
Spread through air spaces (STAS) is a newly identified aggressive pattern in lung cancer that is associated with adverse prognostic factors and complex pathological features. Pathologists currently rely on time-consuming manual assessments, which are highly subjective and prone to variation. This highlights the urgent need for automated and precise diagnostic solutions. We collected 2,970 lung cancer tissue slides from multiple centers, re-diagnosed them, and constructed and publicly released three lung cancer STAS datasets: STAS CSU (hospital), STAS TCGA, and STAS CPTAC. All STAS datasets provide corresponding pathological feature diagnoses and related clinical data. To address the biased, sparse, and heterogeneous nature of STAS, we propose a scale-aware multiple instance learning (SMILE) method for STAS diagnosis of lung cancer. By introducing a scale-adaptive attention mechanism, SMILE can adaptively adjust high-attention instances, reducing over-reliance on local regions and promoting consistent detection of STAS lesions. Extensive experiments show that SMILE achieved competitive diagnostic results on STAS CSU and diagnosed 251 and 319 STAS samples in CPTAC and TCGA, respectively, surpassing the clinical average AUC. The 11 open baseline results are the first to be established for STAS research, laying the foundation for the future expansion, interpretability, and clinical integration of computational pathology technologies. The datasets and code are available at https://anonymous.4open.science/r/IJCAI25-1DA1.
Problem

Research questions and friction points this paper is trying to address.

Automates STAS lung cancer diagnosis using AI.
Reduces subjectivity in manual pathological assessments.
Improves detection consistency of STAS lesions.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Scale-aware multiple instance learning method
Scale-adaptive attention mechanism
Publicly released STAS datasets
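
The abstract does not spell out how the scale-adaptive attention works, so the following is only a minimal sketch of the general idea it describes: attention-based multiple instance pooling in which very high attention weights are damped so that a slide-level prediction does not rest on one or two patches. Everything here is an assumption for illustration, not the authors' implementation: the function name `attention_mil_pool`, the linear scoring vector `w` (which would be learned in practice), and the `cap` parameter with its even redistribution of excess weight.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attention_mil_pool(features, w, cap=None):
    """Pool patch features into one slide-level (bag) embedding.

    features: (n_instances, dim) array of patch embeddings.
    w:        (dim,) scoring vector (hypothetical; learned in a real model).
    cap:      optional ceiling applied to attention weights before the
              clipped-off mass is spread evenly over all instances
              (an illustrative choice, not taken from the paper).
    """
    attn = softmax(features @ w)
    if cap is not None:
        clipped = np.minimum(attn, cap)
        excess = 1.0 - clipped.sum()
        # Weights still sum to 1; the top weight ends slightly above `cap`.
        attn = clipped + excess / len(clipped)
    bag = attn @ features  # attention-weighted average of the instances
    return bag, attn

# Toy bag: one patch dominates the raw attention scores.
features = np.array([[5.0, 0.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0]])
w = np.array([1.0, 0.0])

bag_plain, attn_plain = attention_mil_pool(features, w)
bag_capped, attn_capped = attention_mil_pool(features, w, cap=0.5)
print(attn_plain.round(3), attn_capped.round(3))
```

In the toy bag, plain attention puts almost all weight on the first patch, while the capped variant shifts part of that weight to the remaining patches, which is the kind of reduced reliance on local regions the abstract refers to.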
Liangrui Pan
College of Computer Science and Electronic Engineering, Hunan University, Changsha 410082, China
Xiaoyu Li
College of Computer Science and Electronic Engineering, Hunan University, Changsha 410082, China
Yutao Dou
College of Computer Science and Electronic Engineering, Hunan University, Changsha 410082, China
Qiya Song
College of Information Science and Engineering, Hunan Normal University, Changsha 410082, China
Jiadi Luo
Department of Pathology, The Second Xiangya Hospital, Central South University, Changsha, 410011, Hunan, China
Qingchun Liang
Department of Pathology, The Second Xiangya Hospital, Central South University, Changsha, 410011, Hunan, China
Shaoliang Peng
Cheung Kong Professor, Hunan University
High Performance Computing, Big Data, Bioinformatics, AI