AI Summary
Manual diagnosis of spread through air spaces (STAS) in lung adenocarcinoma (LUAD) across multi-center pathology images is labor-intensive and prone to high false-negative rates. Method: The authors propose STAMP, a multi-pattern attention-aware multiple instance learning (MIL) framework. Its dual-branch architecture uses Transformer-based instance encoding to jointly model global context and local discriminative regions. A dynamic multi-pattern attention aggregation module selects the critical pathological regions, while a cross-branch similarity regularization term mitigates feature redundancy. Results: Evaluated on three independent multi-center datasets, the model achieves AUCs of 0.8058, 0.8017, and 0.7928, which the authors report as surpassing the clinical level. The work is presented as the first to integrate multi-pattern attention mechanisms and cross-branch regularization into the MIL paradigm for STAS detection, with the aim of improving discriminative capability, robustness, and clinical interpretability.
Abstract
Spread through air spaces (STAS) is a novel invasive pattern in lung adenocarcinoma (LUAD) that is associated with tumor recurrence and reduced survival. However, large-scale STAS diagnosis in LUAD remains labor-intensive, and its distinctive pathological and morphological features make it prone to oversight and misdiagnosis. There is therefore a pressing clinical need for deep learning models that support STAS diagnosis. We first collected histopathological images from STAS patients at the Second Xiangya Hospital and the Third Xiangya Hospital of Central South University, alongside the TCGA-LUAD cohort. Three senior pathologists cross-verified the annotations to construct the STAS-SXY, STAS-TXY, and STAS-TCGA datasets. We then propose a multi-pattern attention-aware multiple instance learning framework, named STAMP, to diagnose the presence of STAS across multi-center histopathology images. Specifically, the dual-branch architecture guides the model to learn STAS-associated pathological features from distinct semantic spaces. Transformer-based instance encoding and a multi-pattern attention aggregation module dynamically select regions closely associated with STAS pathology, suppressing irrelevant noise and enhancing the discriminative power of the global representation. Moreover, a similarity regularization constraint prevents feature redundancy across branches, thereby improving overall diagnostic accuracy. Extensive experiments demonstrate that STAMP achieves competitive diagnostic results on STAS-SXY, STAS-TXY, and STAS-TCGA, with AUCs of 0.8058, 0.8017, and 0.7928, respectively, surpassing the clinical level.
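The two ideas that carry most of the architecture described above, attention-based aggregation of patch (instance) embeddings into one slide-level (bag) representation and a penalty that discourages the two branches from learning the same features, can be illustrated with a minimal pure-Python sketch. This is not the STAMP implementation: the linear scoring vectors, the absolute-cosine form of the penalty, and every name below (`attention_pool`, `similarity_penalty`, `w_branch_a`, `w_branch_b`) are assumptions chosen for illustration, and the Transformer-based instance encoding is omitted entirely.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(instances, w):
    """Aggregate instance vectors into one bag vector.

    Each instance receives a scalar score (dot product with the
    hypothetical scoring vector w); softmax turns the scores into
    attention weights that sum to 1.
    """
    scores = [sum(x * wi for x, wi in zip(inst, w)) for inst in instances]
    weights = softmax(scores)
    dim = len(instances[0])
    bag = [sum(a * inst[d] for a, inst in zip(weights, instances))
           for d in range(dim)]
    return bag, weights

def cosine_similarity(u, v):
    """Cosine similarity with a small epsilon to avoid division by zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)

def similarity_penalty(bag_a, bag_b):
    """Assumed regularizer: grows as the two branch embeddings align."""
    return abs(cosine_similarity(bag_a, bag_b))

# Toy bag: three patch embeddings of dimension 2 (made-up numbers).
patches = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w_branch_a = [2.0, 0.0]  # hypothetical scoring vector for branch A
w_branch_b = [0.0, 2.0]  # hypothetical scoring vector for branch B

bag_a, weights_a = attention_pool(patches, w_branch_a)
bag_b, weights_b = attention_pool(patches, w_branch_b)
penalty = similarity_penalty(bag_a, bag_b)
```

In a trained model the scoring vectors would be learned, and a penalty of this kind would be added to the classification loss with some weighting factor so that the two branches are pushed toward complementary representations. In this toy bag, branch A down-weights the second patch and branch B down-weights the first, which is the kind of region selection the abstract attributes to the attention aggregation module.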