🤖 AI Summary
Current approaches to whole-slide image analysis in pathology mostly rely on post-hoc explanations, which highlight salient regions but do not provide accountable evidence for cancer subtype predictions. To address this limitation, this work proposes PA-MIL, an ante-hoc interpretable framework. It first constructs a phenotype knowledge base of cancer-related phenotypes and their associated genotypes, and uses the morphological descriptions of those phenotypes as language prompts to aggregate phenotype-related features. It then introduces a Genotype-to-Phenotype Neural Network (GP-NN), built on genotype-to-phenotype relationships, that provides multi-level guidance for PA-MIL, while a linear classifier over phenotype saliency performs the subtype classification. Evaluated across multiple datasets, PA-MIL achieves performance competitive with state-of-the-art methods while improving interpretability, with analyses at both the case and cohort levels.
📝 Abstract
Deep learning has been extensively studied for the analysis of pathology whole-slide images (WSIs). However, most existing methods offer interpretability only by locating the model's salient areas in a post-hoc manner, and so fail to provide reliable and accountable explanations. In this work, we propose Phenotype-Aware Multiple Instance Learning (PA-MIL), a novel ante-hoc interpretable framework that identifies cancer-related phenotypes from WSIs and uses them for cancer subtyping. To help PA-MIL learn phenotype-aware features, we 1) construct a phenotype knowledge base containing cancer-related phenotypes and their associated genotypes; 2) use the morphological descriptions of phenotypes as language prompts to aggregate phenotype-related features; and 3) devise the Genotype-to-Phenotype Neural Network (GP-NN), grounded in genotype-to-phenotype relationships, which provides multi-level guidance for PA-MIL. Experimental results on multiple datasets demonstrate that PA-MIL achieves performance competitive with existing MIL methods while offering improved interpretability. Using phenotype saliency as evidence and only a linear classifier, PA-MIL remains competitive with state-of-the-art methods. Additionally, we thoroughly analyze the genotype-phenotype relationships, as well as cohort-level and case-level interpretability, demonstrating the reliability and accountability of PA-MIL.
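The general idea of prompt-guided aggregation followed by a linear classifier can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the encoders, the aggregation rule, or how GP-NN guidance is applied, so the cosine-similarity attention pooling, the `temperature` parameter, all function names, and the random vectors standing in for patch and prompt embeddings are assumptions made purely for illustration.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    # Small epsilon guards against division by zero for all-zero vectors.
    return dot(a, b) / (math.sqrt(dot(a, a) * dot(b, b)) + 1e-12)


def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def phenotype_saliency(patches, prompts, temperature=0.1):
    """Hypothetical aggregation: one saliency score per phenotype prompt.

    Each prompt attends over all patches of a slide (softmax of the
    patch-prompt cosine similarities), and the attention-weighted mean
    similarity is taken as that phenotype's saliency.
    """
    scores = []
    for prompt in prompts:
        sims = [cosine(patch, prompt) for patch in patches]
        attn = softmax([s / temperature for s in sims])
        scores.append(sum(a * s for a, s in zip(attn, sims)))
    return scores


def linear_classify(saliency, weights, biases):
    """Linear layer over phenotype saliency, followed by a softmax."""
    logits = [dot(w, saliency) + b for w, b in zip(weights, biases)]
    return softmax(logits)


# Random stand-ins for embeddings and classifier weights (assumed shapes).
random.seed(0)
dim, n_patches, n_phenotypes, n_subtypes = 8, 50, 4, 2
patches = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n_patches)]
prompts = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n_phenotypes)]
weights = [[random.gauss(0, 1) for _ in range(n_phenotypes)] for _ in range(n_subtypes)]
biases = [0.0] * n_subtypes

saliency = phenotype_saliency(patches, prompts)
probs = linear_classify(saliency, weights, biases)

# Because the classifier is linear, each subtype logit splits exactly into
# per-phenotype terms (weight * saliency) plus the bias.
contributions = [[w_k * s_k for w_k, s_k in zip(w, saliency)] for w in weights]
```

Under these assumptions, `contributions[c][k]` is the share of subtype `c`'s logit attributable to phenotype `k`, which is the sense in which a linear classifier over phenotype saliency yields evidence that can be read off directly rather than reconstructed post hoc.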