🤖 AI Summary
Existing whole-slide image analysis methods typically embed image patches in a homogeneous Euclidean space, which makes it hard to capture both the hierarchical structure of tissue and local cellular heterogeneity. To overcome this, the study introduces, for the first time in computational pathology, a hybrid hyperbolic-Euclidean geometric embedding framework. The resulting method, BatMIL, combines a Structured State Space (S4) model with a region-level Mixture-of-Experts (MoE) module within a multiple instance learning paradigm, enabling joint modeling of global architectural patterns and fine-grained local details. Evaluated on seven whole-slide image datasets covering six cancer types, the method consistently outperforms current state-of-the-art approaches on slide-level classification tasks.
📝 Abstract
Accurate analysis of histopathological images is critical for disease diagnosis and treatment planning. Whole-slide images (WSIs), which digitize tissue specimens at gigapixel resolution, are fundamental to this process, but slide-level prediction requires aggregating thousands of patches. Multiple Instance Learning (MIL) tackles this challenge with a two-stage paradigm that decouples tile-level embedding from slide-level prediction. However, most existing methods implicitly embed patch representations in homogeneous Euclidean spaces, overlooking the hierarchical organization and regional heterogeneity of pathological tissues. This limits their ability to capture both global tissue architecture and fine-grained cellular morphology. To address this limitation, we introduce a hybrid hyperbolic-Euclidean representation that embeds WSI features in dual geometric spaces, enabling complementary modeling of hierarchical tissue structures and local morphological details. Building on this formulation, we develop BatMIL, a WSI classification framework that leverages both geometric spaces. To model long-range dependencies among thousands of patches, we employ a structured state space sequence model (S4) backbone that encodes patch sequences with linear computational complexity. Furthermore, to account for regional heterogeneity, we introduce a chunk-level mixture-of-experts (MoE) module that groups patches into regions and dynamically routes them to specialized subnetworks, improving representational capacity while reducing redundant computation. Extensive experiments on seven WSI datasets spanning six cancer types demonstrate that BatMIL consistently outperforms state-of-the-art MIL approaches in slide-level classification tasks. These results indicate that geometry-aware representation learning offers a promising direction for next-generation computational pathology.
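To make the idea of a dual-space representation concrete, the following is a minimal sketch and not the BatMIL implementation. The abstract does not say which hyperbolic model is used, so the Poincaré ball with curvature -c is an assumption here. The helpers `hybrid_embed` and `hybrid_dist`, the `split` index, and the weight `alpha` are all hypothetical names introduced only for illustration. The sketch keeps part of a patch feature in Euclidean space and maps the remainder onto the ball with the exponential map at the origin.

```python
import math

def expmap0(v, c=1.0):
    """Map a tangent vector at the origin onto the Poincare ball of curvature -c."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        return list(v)
    scale = math.tanh(math.sqrt(c) * norm) / (math.sqrt(c) * norm)
    return [scale * x for x in v]

def poincare_dist(x, y, c=1.0):
    """Geodesic distance between two points strictly inside the Poincare ball."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    x2 = sum(a * a for a in x)
    y2 = sum(b * b for b in y)
    arg = 1.0 + 2.0 * c * sq / ((1.0 - c * x2) * (1.0 - c * y2))
    # Clamp guards against rounding that could push arg slightly below 1.
    return math.acosh(max(arg, 1.0)) / math.sqrt(c)

def hybrid_embed(feat, split):
    """Hypothetical split: the first `split` dims stay Euclidean, the rest go hyperbolic."""
    return feat[:split], expmap0(feat[split:])

def hybrid_dist(a, b, alpha=0.5):
    """Illustrative weighted sum of the two distances; `alpha` is an assumed mixing weight."""
    (a_euc, a_hyp), (b_euc, b_hyp) = a, b
    euc = math.sqrt(sum((p - q) ** 2 for p, q in zip(a_euc, b_euc)))
    return alpha * euc + (1.0 - alpha) * poincare_dist(a_hyp, b_hyp)

# Usage: two toy 4-dimensional patch features, split 2 + 2.
patch_a = hybrid_embed([0.2, -0.1, 0.3, 0.4], split=2)
patch_b = hybrid_embed([0.1, 0.0, -0.3, 0.1], split=2)
print(hybrid_dist(patch_a, patch_b))
```

On the unit-curvature ball, distances grow rapidly toward the boundary, which is why hyperbolic space is often used for tree-like hierarchies, while the Euclidean branch retains a flat metric suited to local similarity. In a trained model the split and the mixing would presumably be learned rather than fixed as they are here.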