DSAGL: Dual-Stream Attention-Guided Learning for Weakly Supervised Whole Slide Image Classification

📅 2025-05-29
🤖 AI Summary
In weakly supervised classification of whole-slide images (WSIs), the extreme image resolution and scarcity of fine-grained annotations lead to instance ambiguity and bag-level semantic inconsistency. To address these challenges, we propose a dual-stream attention-guided framework. Our method introduces a novel multi-scale attention-based pseudo-label generation mechanism, employs a lightweight shared VSSMamba encoder to model long-range dependencies, and incorporates a Fusion Attention and Semantic Alignment (FASA) module for cross-stream feature co-optimization. Furthermore, we design a dual-stream mutual consistency hybrid loss and establish an end-to-end teacher-student collaborative training paradigm. Evaluated on CIFAR-10, NCT-CRC, and TCGA-Lung datasets, our approach consistently outperforms state-of-the-art multiple-instance learning (MIL) models, achieving superior classification accuracy and enhanced robustness under weak supervision.

๐Ÿ“ Abstract
Whole-slide images (WSIs) are critical for cancer diagnosis due to their ultra-high resolution and rich semantic content. However, their massive size and the limited availability of fine-grained annotations pose substantial challenges for conventional supervised learning. We propose DSAGL (Dual-Stream Attention-Guided Learning), a novel weakly supervised classification framework that combines a teacher-student architecture with a dual-stream design. DSAGL explicitly addresses instance-level ambiguity and bag-level semantic inconsistency by generating multi-scale attention-based pseudo labels and guiding instance-level learning. A shared lightweight encoder (VSSMamba) enables efficient long-range dependency modeling, while a fusion-attentive module (FASA) enhances focus on sparse but diagnostically relevant regions. We further introduce a hybrid loss to enforce mutual consistency between the two streams. Experiments on CIFAR-10, NCT-CRC, and TCGA-Lung datasets demonstrate that DSAGL consistently outperforms state-of-the-art MIL baselines, achieving superior discriminative performance and robustness under weak supervision.

Problem

Research questions and friction points this paper is trying to address.

Classifying whole-slide images with weak supervision
Addressing instance-level ambiguity in cancer diagnosis
Enhancing focus on diagnostically relevant regions

Innovation

Methods, ideas, or system contributions that make the work stand out.

Dual-stream design for weakly supervised learning
Attention-guided multi-scale pseudo label generation
Lightweight encoder with long-range dependency modeling
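
The attention-guided pseudo-label idea listed above can be illustrated with a minimal sketch of generic attention-based MIL pooling. This is not the DSAGL implementation: the function names, the single-scale attention scores, the toy feature vectors, and the thresholding rule (attention above the uniform weight 1/N) are all assumptions made for illustration, and the multi-scale, VSSMamba, and FASA components are not reproduced here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(features, scores):
    """Return the attention-weighted bag embedding and the attention weights."""
    weights = softmax(scores)
    dim = len(features[0])
    bag = [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]
    return bag, weights


def pseudo_labels(weights, bag_label, ratio=1.0):
    """Hypothetical rule: in a positive bag, mark instances whose attention
    exceeds ratio * (1 / N); a negative bag yields all-zero labels."""
    if bag_label == 0:
        return [0] * len(weights)
    threshold = ratio / len(weights)
    return [1 if w > threshold else 0 for w in weights]


# Toy bag of four patch features with made-up attention scores.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
scores = [0.1, 0.2, 3.0, 0.0]

bag_embedding, weights = attention_pool(features, scores)
labels = pseudo_labels(weights, bag_label=1)
print(labels)  # [0, 0, 1, 0]
```

In this toy bag only the third patch receives attention above the uniform level, so it alone is pseudo-labeled positive; in a teacher-student setup, labels of this kind would supervise the instance-level stream.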
Daoxi Cao
College of Computer Science and Technology (College of Data Science), Taiyuan University of Technology, Taiyuan, Shanxi, 030024, China
Hangbei Cheng
Taiyuan University of Technology
Computer Vision
Yijin Li
State Key Lab of CAD&CG, Zhejiang University, China
Computer Vision
Ruolin Zhou
College of Computer Science and Technology (College of Data Science), Taiyuan University of Technology, Taiyuan, Shanxi, 030024, China
Xinyi Li
College of Artificial Intelligence, Taiyuan University of Technology, Taiyuan, Shanxi, 030024, China
Xuehan Zhang
College of Computer Science and Technology (College of Data Science), Taiyuan University of Technology, Taiyuan, Shanxi, 030024, China
Binwei Li
College of Computer Science and Technology (College of Data Science), Taiyuan University of Technology, Taiyuan, Shanxi, 030024, China
Xuancheng Gu
School of Cyberspace Security, Beijing University of Posts and Telecommunications, Beijing, 102206, China
Xueyu Liu
Taiyuan University of Technology
Deep Learning, Medical Image Analysis
Yongfei Wu
Taiyuan University of Technology
Computer Vision