Hierarchical Prototype-based Domain Priors for Multiple Instance Learning in Multimodal Histopathology Analysis

Date: 2026-04-26

AI Summary
Existing multiple instance learning (MIL) approaches treat whole-slide images (WSIs) as unstructured collections of image patches, neglecting the morphological semantics and spatial geometry inherent in tissue architecture. This makes them prone to overfitting on background noise and leaves visual features misaligned with clinical diagnostic knowledge. To address this, the work proposes the Hierarchical Prototype-based Domain Priors (HPDP) framework, which introduces a Morphologically Anchored Prototype System (MAPS) to anchor learning to interpretable morphological clusters, uses a Sinusoidal Positional Encoder (SPE) to model tissue architecture, and adds a Hierarchical Cross-Modal Alignment (HCMA) module that uses pathology descriptions generated by large language models to refine visual representations. Across seven cancer cohorts, the authors report state-of-the-art performance on diagnosis and prognosis, with improved robustness and interpretability.
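
To make the spatial-geometry idea concrete, here is a minimal sketch of a standard sinusoidal positional encoding applied to a patch's grid position. It is not the paper's SPE: the function names, the `dim` and `base` parameters, and the choice to encode row and column separately and concatenate them are all assumptions made for illustration.

```python
import math


def sinusoidal_encoding_1d(position, dim, base=10000.0):
    """Encode one scalar position as `dim` interleaved sin/cos values.

    Hypothetical helper: follows the common Transformer-style formula,
    not necessarily the encoder used in HPDP.
    """
    if dim <= 0 or dim % 2 != 0:
        raise ValueError("dim must be a positive even number")
    encoding = []
    for i in range(dim // 2):
        # Frequencies decrease geometrically with the pair index i.
        freq = 1.0 / (base ** (2 * i / dim))
        encoding.append(math.sin(position * freq))
        encoding.append(math.cos(position * freq))
    return encoding


def encode_patch_position(row, col, dim=8):
    """Encode a patch's (row, col) grid index as a `dim`-length vector.

    Assumption: half of the vector describes the row, half the column.
    """
    if dim % 4 != 0:
        raise ValueError("dim must be divisible by 4")
    half = dim // 2
    return sinusoidal_encoding_1d(row, half) + sinusoidal_encoding_1d(col, half)


if __name__ == "__main__":
    # Two patches with swapped coordinates receive different encodings,
    # which a plain bag-of-patches representation cannot express.
    print(encode_patch_position(0, 1))
    print(encode_patch_position(1, 0))
```

In a MIL pipeline, a vector like this would typically be added to or concatenated with each patch feature before aggregation, so that the aggregator can distinguish patches by location as well as by appearance.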

Abstract
Digital pathology has fundamentally altered diagnostic workflows by enabling the computational analysis of gigapixel Whole Slide Images (WSIs), yet effectively deciphering their complex tumor microenvironments remains a formidable challenge. Existing Multiple Instance Learning (MIL) frameworks typically treat Whole Slide Images as unstructured bags of patches, discarding critical morphological semantics and spatial geometry. This lack of inductive bias often leads to overfitting on background noise and fails to align visual features with high-level diagnostic knowledge. To overcome these limitations, we propose the Hierarchical Prototype-based Domain Priors (HPDP) framework, a unified multimodal approach for joint histopathology diagnosis and prognosis. HPDP mitigates the data-driven "black box" issue by introducing a Morphologically Anchored Prototype System (MAPS), which anchors learning to interpretable morphological clusters, and a Sinusoidal Positional Encoder (SPE) to explicitly model tissue architecture. Furthermore, we bridge the semantic gap via a Hierarchical Cross-Modal Alignment (HCMA) module, using Large Language Model (LLM)-generated descriptions to contextually refine visual representations. Extensive experiments across seven cancer cohorts demonstrate that HPDP consistently achieves state-of-the-art performance with superior robustness and interpretability.
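
As a rough illustration of what anchoring patch features to prototypes can look like, the sketch below softly assigns a patch feature vector to a set of prototype vectors using cosine similarity and a temperature-scaled softmax. This is a generic prototype-assignment technique, not the MAPS module itself; the function names, the `temperature` parameter, and the toy two-dimensional vectors are assumptions for illustration only.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def soft_assign(patch, prototypes, temperature=0.1):
    """Return softmax weights of one patch over all prototypes.

    Hypothetical helper: a lower temperature gives a sharper assignment.
    """
    sims = [cosine_similarity(patch, p) for p in prototypes]
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(sims)
    exps = [math.exp((s - top) / temperature) for s in sims]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    # Toy prototypes standing in for two morphological clusters.
    prototypes = [[1.0, 0.0], [0.0, 1.0]]
    weights = soft_assign([0.9, 0.1], prototypes)
    print(weights)  # the first prototype receives most of the weight
```

The resulting weights can be read as an interpretable description of each patch ("mostly prototype 0"), which is the kind of inductive bias the abstract contrasts with treating a slide as an unstructured bag of patches.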
Problem

Research questions and friction points this paper is trying to address.

Multiple Instance Learning
Whole Slide Images
morphological semantics
spatial geometry
diagnostic knowledge
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical Prototype
Morphologically Anchored Prototype System
Sinusoidal Positional Encoder
Hierarchical Cross-Modal Alignment
Multiple Instance Learning
Authors
Xuemei Qiu
College of Computer and Information Science, Fujian Agriculture and Forestry University, Fuzhou, 350002, China
Dawei Fan
College of Computer and Information Science, Fujian Agriculture and Forestry University, Fuzhou, 350002, China
Yebin Huang
College of Computer and Information Science, Fujian Agriculture and Forestry University, Fuzhou, 350002, China
Yanping Chen
South China Normal University
Lifang Wei
College of Computer and Information Science, Fujian Agriculture and Forestry University, Fuzhou, 350002, China; College of Future Technology, Fujian Agriculture and Forestry University, Fuzhou, 350002, China; Digital Fujian Institute of Agricultural Big Data, Fujian Agriculture and Forestry University, Fuzhou, 350002, China