DSAGL: Dual-Stream Attention-Guided Learning for Weakly Supervised Whole Slide Image Classification

📅 2025-05-29

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

In weakly supervised classification of whole-slide images (WSIs), the extreme image resolution and scarcity of fine-grained annotations lead to instance ambiguity and bag-level semantic inconsistency. To address these challenges, we propose a dual-stream attention-guided framework. Our method introduces a novel multi-scale attention-based pseudo-label generation mechanism, employs a lightweight shared VSSMamba encoder to model long-range dependencies, and incorporates a Fusion Attention and Semantic Alignment (FASA) module for cross-stream feature co-optimization. Furthermore, we design a dual-stream mutual consistency hybrid loss and establish an end-to-end teacher–student collaborative training paradigm. Evaluated on CIFAR-10, NCT-CRC, and TCGA-Lung datasets, our approach consistently outperforms state-of-the-art multiple-instance learning (MIL) models, achieving superior classification accuracy and enhanced robustness under weak supervision.

📝 Abstract

Whole-slide images (WSIs) are critical for cancer diagnosis due to their ultra-high resolution and rich semantic content. However, their massive size and the limited availability of fine-grained annotations pose substantial challenges for conventional supervised learning. We propose DSAGL (Dual-Stream Attention-Guided Learning), a novel weakly supervised classification framework that combines a teacher-student architecture with a dual-stream design. DSAGL explicitly addresses instance-level ambiguity and bag-level semantic inconsistency by generating multi-scale attention-based pseudo-labels that guide instance-level learning. A shared lightweight encoder (VSSMamba) enables efficient long-range dependency modeling, while a Fusion Attention and Semantic Alignment (FASA) module sharpens focus on sparse but diagnostically relevant regions. We further introduce a hybrid loss to enforce mutual consistency between the two streams. Experiments on the CIFAR-10, NCT-CRC, and TCGA-Lung datasets show that DSAGL consistently outperforms state-of-the-art multiple-instance learning (MIL) baselines in both discriminative performance and robustness under weak supervision.
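The abstract does not give the exact form of the hybrid loss, so the following is only a minimal sketch of what a mutual-consistency objective between two streams could look like. It assumes each stream outputs class probabilities for the same bag, uses cross-entropy against the bag label as the supervised term, and swaps in a symmetric KL divergence as the consistency term. The function names and the weight `lam` are hypothetical and do not come from the paper.

```python
import math

EPS = 1e-12  # guards against log(0)


def cross_entropy(probs, target):
    """Negative log-likelihood of the target class."""
    return -math.log(max(probs[target], EPS))


def kl(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / max(qi, EPS)) for pi, qi in zip(p, q) if pi > 0)


def symmetric_kl(p, q):
    """Average of KL(p || q) and KL(q || p); zero when the streams agree."""
    return 0.5 * (kl(p, q) + kl(q, p))


def hybrid_loss(stream_a, stream_b, bag_label, lam=0.5):
    """Supervised term on both streams plus a weighted consistency term.

    Assumption: `lam` is an illustrative weight, not a value from the paper.
    """
    supervised = 0.5 * (cross_entropy(stream_a, bag_label)
                        + cross_entropy(stream_b, bag_label))
    consistency = symmetric_kl(stream_a, stream_b)
    return supervised + lam * consistency


# Two streams predicting over two classes for a bag labeled 0.
agree = hybrid_loss([0.8, 0.2], [0.8, 0.2], bag_label=0)
disagree = hybrid_loss([0.9, 0.1], [0.6, 0.4], bag_label=0)
```

When the two streams agree, the consistency term vanishes and only the supervised term remains. When they disagree, the loss is larger than the supervised term alone, which pushes the streams toward each other.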

Problem

Research questions and friction points this paper is trying to address.

Classifying whole-slide images with weak supervision

Addressing instance-level ambiguity in cancer diagnosis

Enhancing focus on diagnostically relevant regions

Innovation

Methods, ideas, or system contributions that make the work stand out.

Dual-stream design for weakly supervised learning

Attention-guided multi-scale pseudo-label generation

Lightweight encoder with long-range dependency modeling
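As a rough illustration of attention-guided pseudo-labeling in a MIL setting, the sketch below pools patch features with softmax attention weights and then marks the highest-attention patches of a positive bag as positive instances. It is a generic single-scale example, not the paper's multi-scale mechanism. The helper names and the `top_k` rule are assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of attention logits."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(instance_features, attention_scores):
    """Return the attention-weighted bag embedding and the weights."""
    weights = softmax(attention_scores)
    dim = len(instance_features[0])
    bag = [sum(w * f[d] for w, f in zip(weights, instance_features))
           for d in range(dim)]
    return bag, weights


def pseudo_labels(weights, bag_label, top_k=2):
    """Assign instance pseudo-labels from attention weights.

    Assumption: in a negative bag every instance is negative; in a
    positive bag the top_k most attended instances are marked positive.
    """
    if bag_label == 0:
        return [0] * len(weights)
    ranked = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    positive = set(ranked[:top_k])
    return [1 if i in positive else 0 for i in range(len(weights))]


# Four patches with 2-D features and hypothetical attention logits.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 2.0]]
scores = [0.1, 2.0, -1.0, 1.5]
bag_embedding, weights = attention_pool(features, scores)
labels = pseudo_labels(weights, bag_label=1, top_k=2)  # [0, 1, 0, 1]
```

In a teacher-student setup, labels like these from one stream could supervise instance-level predictions in the other, although the page does not spell out how DSAGL wires this together.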

🔎 Similar Papers

Distilling High Diagnostic Value Patches for Whole Slide Image Classification Using Attention Mechanism