AI Summary
Current WSI analysis methods rely on Euclidean embeddings, which struggle to model the semantic hierarchy (patch → region → slide), limiting classification performance. To address this, we propose the first WSI semantic modeling framework integrating pathological text knowledge with hyperbolic geometry: hierarchical visual representations are constructed within the Poincaré ball, and prior knowledge is injected via a pathology-oriented vision-language foundation model. We introduce an angular modality alignment loss for cross-modal alignment and a semantic hierarchy consistency loss to refine hierarchies under entailment/contradiction constraints. Final classification is performed via geodesic distance in hyperbolic space. Our method significantly outperforms state-of-the-art multiple-instance learning approaches across multiple WSI benchmarks, achieving 3.2–5.8% accuracy gains in cancer subtype classification. Results validate the effectiveness and generalizability of hyperbolic geometry for modeling semantic hierarchies in WSI analysis.
Abstract
Pathology is essential for cancer diagnosis, and multiple instance learning (MIL) is widely used for whole slide image (WSI) analysis. WSIs exhibit a natural hierarchy (patches, regions, and slides) with distinct semantic associations. While some methods attempt to leverage this hierarchy for improved representation, they predominantly rely on Euclidean embeddings, which struggle to fully capture semantic hierarchies. To address this limitation, we propose HyperPath, a novel method that integrates knowledge from textual descriptions to guide the modeling of semantic hierarchies of WSIs in hyperbolic space, thereby enhancing WSI classification. Our approach adapts both visual and textual features extracted by pathology vision-language foundation models to hyperbolic space. We design an Angular Modality Alignment Loss to ensure robust cross-modal alignment, while a Semantic Hierarchy Consistency Loss further refines feature hierarchies through entailment and contradiction relationships, thereby enhancing semantic coherence. Classification is performed with geodesic distance, which measures the similarity between entities in the hyperbolic semantic hierarchy. This eliminates the need for linear classifiers and enables a geometry-aware approach to WSI analysis. Extensive experiments show that our method achieves superior performance across tasks compared to existing methods, highlighting the potential of hyperbolic embeddings for WSI analysis.
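To make the geodesic-distance classification step concrete, the following is a minimal sketch and not the HyperPath implementation. It assumes a Poincaré ball of curvature -c, an exponential map at the origin to move Euclidean features into the ball, and the standard distance d_c(x, y) = (1/√c) · arcosh(1 + 2c‖x − y‖² / ((1 − c‖x‖²)(1 − c‖y‖²))). The function names (`expmap0`, `poincare_distance`, `classify`) and the use of class-level text embeddings as prototypes are hypothetical choices made for illustration; the abstract does not specify them.

```python
import math


def expmap0(v, c=1.0, eps=1e-5):
    """Map a Euclidean vector (tangent at the origin) into the Poincaré ball of curvature -c."""
    sqrt_c = math.sqrt(c)
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        return [0.0 for _ in v]
    # tanh squashes the norm into [0, 1/sqrt(c)); clip slightly inside the
    # boundary so later distance computations never divide by zero.
    new_norm = min(math.tanh(sqrt_c * norm) / sqrt_c, (1.0 - eps) / sqrt_c)
    return [x * new_norm / norm for x in v]


def poincare_distance(x, y, c=1.0):
    """Geodesic distance between two points strictly inside the Poincaré ball."""
    sq_diff = sum((a - b) ** 2 for a, b in zip(x, y))
    sq_x = sum(a * a for a in x)
    sq_y = sum(b * b for b in y)
    arg = 1.0 + 2.0 * c * sq_diff / ((1.0 - c * sq_x) * (1.0 - c * sq_y))
    return math.acosh(arg) / math.sqrt(c)


def classify(slide_feature, class_prototypes, c=1.0):
    """Return the label whose prototype is geodesically closest to the slide.

    `class_prototypes` is a hypothetical dict mapping label -> Euclidean
    embedding (for example, a text embedding of the class description).
    """
    slide_point = expmap0(slide_feature, c)
    distances = {
        label: poincare_distance(slide_point, expmap0(proto, c), c)
        for label, proto in class_prototypes.items()
    }
    return min(distances, key=distances.get), distances


if __name__ == "__main__":
    prototypes = {"subtype_a": [1.0, 0.0], "subtype_b": [-1.0, 0.0]}
    label, dists = classify([0.8, 0.1], prototypes)
    print(label, dists)
```

Because the prediction is simply the nearest prototype under the hyperbolic metric, no linear classification head is needed in this sketch. A trained system would typically use a tensor library and learn the curvature and the feature adapters, which are omitted here.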