SSMamba: A Self-Supervised Hybrid State Space Model for Pathological Image Classification

📅 2026-04-17
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work addresses three challenges in region-of-interest (ROI) analysis of histopathological images: cross-magnification domain shift, insufficient modeling of local-global relationships, and inadequate sensitivity to fine-grained diagnostic cues. The authors propose SSMamba, a self-supervised hybrid state space model trained in two stages: self-supervised pretraining on the target ROI datasets, followed by supervised fine-tuning. The framework combines three components: Mamba Masked Image Modeling (MAMIM) to mitigate domain shift, a Directional Multi-scale (DMS) module to balance local and global feature representation, and a Local Perception Residual (LPR) module to enhance sensitivity to subtle pathological patterns. SSMamba outperforms 11 pathology foundation models on 10 public ROI datasets and 8 state-of-the-art methods on 6 public whole-slide image (WSI) datasets, which the authors take as evidence for the value of task-specific architectural design.

📝 Abstract
Pathological diagnosis relies heavily on image analysis, where Regions of Interest (ROIs) serve as the primary basis for diagnostic evidence, while whole-slide image (WSI)-level tasks primarily capture aggregated patterns. To extract these critical morphological features, ROI-level Foundation Models (FMs) based on Vision Transformers (ViTs) and large-scale self-supervised learning (SSL) have been widely adopted. However, three core limitations remain in their application to ROI analysis: (1) cross-magnification domain shift, as fixed-scale pretraining hinders adaptation to diverse clinical settings; (2) inadequate local-global relationship modeling, wherein the ViT backbone of FMs suffers from high computational overhead and imprecise local characterization; and (3) insufficient fine-grained sensitivity, as traditional self-attention mechanisms tend to overlook subtle diagnostic cues. To address these challenges, we propose SSMamba, a hybrid SSL framework that enables effective fine-grained feature learning without relying on large external datasets. This framework incorporates three domain-adaptive components: Mamba Masked Image Modeling (MAMIM) for mitigating domain shift, a Directional Multi-scale (DMS) module for balanced local-global modeling, and a Local Perception Residual (LPR) module for enhanced fine-grained sensitivity. Employing a two-stage pipeline of SSL pretraining on target ROI datasets followed by supervised fine-tuning (SFT), SSMamba outperforms 11 state-of-the-art (SOTA) pathology FMs on 10 public ROI datasets and surpasses 8 SOTA methods on 6 public WSI datasets. These results validate the superiority of task-specific architectural designs for pathological image analysis.
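
The abstract does not spell out how MAMIM is implemented, so the following is only a minimal sketch of generic masked image modeling, the family of pretraining objectives that MAMIM's name points to. It splits an image into patches, hides a random subset, and scores a reconstruction on the hidden patches only. The helper names (`patchify`, `random_mask`, `masked_mse`), the 4x4 patch size, the 0.75 mask ratio, and the all-zeros stand-in predictor are all assumptions made for illustration; the Mamba encoder, the DMS and LPR modules, and the supervised fine-tuning stage are not shown.

```python
import random


def patchify(image, patch):
    """Split an H x W image (list of rows) into flattened, non-overlapping patches."""
    height, width = len(image), len(image[0])
    assert height % patch == 0 and width % patch == 0, "image must tile evenly"
    patches = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            patches.append(
                [image[top + i][left + j] for i in range(patch) for j in range(patch)]
            )
    return patches


def random_mask(num_patches, mask_ratio, rng):
    """Return one boolean per patch; True marks a patch hidden from the encoder."""
    num_masked = int(round(num_patches * mask_ratio))
    masked = set(rng.sample(range(num_patches), num_masked))
    return [index in masked for index in range(num_patches)]


def masked_mse(target, predicted, mask):
    """Mean squared error computed over masked patches only."""
    total, count = 0.0, 0
    for true_patch, pred_patch, is_masked in zip(target, predicted, mask):
        if is_masked:
            total += sum((a - b) ** 2 for a, b in zip(true_patch, pred_patch))
            count += len(true_patch)
    return total / count if count else 0.0


rng = random.Random(0)  # fixed seed so the sketch is repeatable
image = [[float(row * 8 + col) for col in range(8)] for row in range(8)]  # toy 8x8 "ROI"
patches = patchify(image, patch=4)  # assumed patch size
mask = random_mask(len(patches), mask_ratio=0.75, rng=rng)  # assumed mask ratio

# Stand-in for a real encoder/decoder: predicts zeros for every patch.
predicted = [[0.0] * len(p) for p in patches]
loss = masked_mse(patches, predicted, mask)
```

In a real pretraining loop the zero predictor would be replaced by a trainable network that sees only the visible patches, and `loss` would be minimized over unlabeled ROI images before the supervised fine-tuning stage.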
Problem

Research questions and friction points this paper is trying to address.

domain shift
local-global modeling
fine-grained sensitivity
pathological image classification
self-supervised learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

SSMamba
State Space Model
Self-Supervised Learning
Pathological Image Classification
Multi-scale Modeling