🤖 AI Summary
This work addresses the limited morphological interpretability of existing attention-based multiple instance learning (MIL) models in predicting human papillomavirus (HPV) status from whole-slide histopathology images. To overcome this limitation, the authors propose an unsupervised method that automatically discovers HPV-associated histomorphological concepts without requiring concept-level annotations. By constructing an attention-weighted latent space, the approach generates spatial concept maps and compact concept proportion vectors, effectively compressing high-dimensional embeddings into only ten interpretable morphological concepts. Notably, this is the first unsupervised framework to achieve automatic extraction of HPV-related morphological concepts, and it is compatible with various backbone networks. Experiments on TCGA-HNSCC, TCGA-CESC, and CPTAC-HNSCC datasets demonstrate that the model preserves the predictive performance of conventional MIL while substantially enhancing interpretability.
📝 Abstract
Human papillomavirus (HPV) status is a critical determinant of prognosis and treatment response in head and neck and cervical cancers. Although attention-based multiple instance learning (MIL) achieves strong slide-level prediction for HPV-related whole-slide histopathology, it provides limited morphologic interpretability. To address this limitation, we introduce Concept-Level Explainable Attention-guided Representation for HPV (CLEAR-HPV), a framework that restructures the MIL latent space using attention to enable concept discovery without requiring concept labels during training. Operating in an attention-weighted latent space, CLEAR-HPV automatically discovers keratinizing, basaloid, and stromal morphologic concepts, generates spatial concept maps, and represents each slide using a compact concept-fraction vector. CLEAR-HPV’s concept-fraction vectors preserve the predictive information of the original MIL embeddings while reducing the high-dimensional feature space (e.g., 1536 dimensions) to only 10 interpretable concepts. CLEAR-HPV generalizes consistently across TCGA-HNSCC, TCGA-CESC, and CPTAC-HNSCC, providing compact, concept-level interpretability through a general, backbone-agnostic framework for attention-based MIL models of whole-slide histopathology.
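To make the idea of a concept-fraction vector concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not say how concepts are discovered, so the sketch assumes plain k-means on attention-scaled patch embeddings; the function names `kmeans` and `concept_fractions`, the scaling rule, and the synthetic data (64-dimensional embeddings standing in for the 1536-dimensional features mentioned above) are all hypothetical choices made for illustration.

```python
import numpy as np


def kmeans(x, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; stands in for whatever clustering the paper uses."""
    rng = np.random.default_rng(seed)
    # Initialise centers from k distinct data points (fancy indexing copies).
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every center.
        dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = x[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return labels, centers


def concept_fractions(embeddings, attention, k=10):
    """Assign each patch to one of k concepts and summarise the slide.

    Assumption: the "attention-weighted latent space" is approximated by
    scaling each patch embedding by its normalised attention weight.
    """
    weights = attention / attention.sum()
    weighted = embeddings * weights[:, None] * len(weights)
    labels, _ = kmeans(weighted, k)
    # Attention mass per concept; the entries sum to 1 for each slide.
    fractions = np.bincount(labels, weights=weights, minlength=k)
    return labels, fractions


# Synthetic slide: 500 patches with 64-dimensional embeddings (illustrative only).
rng = np.random.default_rng(42)
embeddings = rng.normal(size=(500, 64))
scores = rng.normal(size=500)
attention = np.exp(scores - scores.max())  # softmax-style attention weights
attention /= attention.sum()

labels, fractions = concept_fractions(embeddings, attention, k=10)
print(fractions.round(3))
```

Under these assumptions, `labels` plays the role of a per-patch concept assignment that could be painted back onto patch coordinates as a spatial concept map, while `fractions` is the compact 10-entry slide descriptor that a downstream classifier would consume in place of the full embedding.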