CLEAR-HPV: Interpretable Concept Discovery for HPV-Associated Morphology in Whole-Slide Histology

📅 2026-02-04
🏛️ bioRxiv
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the limited morphological interpretability of existing attention-based multiple instance learning (MIL) models in predicting human papillomavirus (HPV) status from whole-slide histopathology images. To overcome this limitation, the authors propose an unsupervised method that automatically discovers HPV-associated histomorphological concepts without requiring concept-level annotations. By constructing an attention-weighted latent space, the approach generates spatial concept maps and compact concept proportion vectors, effectively compressing high-dimensional embeddings into only ten interpretable morphological concepts. Notably, this is the first unsupervised framework to achieve automatic extraction of HPV-related morphological concepts, and it is compatible with various backbone networks. Experiments on TCGA-HNSCC, TCGA-CESC, and CPTAC-HNSCC datasets demonstrate that the model preserves the predictive performance of conventional MIL while substantially enhancing interpretability.

📝 Abstract
Human papillomavirus (HPV) status is a critical determinant of prognosis and treatment response in head and neck and cervical cancers. Although attention-based multiple instance learning (MIL) achieves strong slide-level prediction for HPV-related whole-slide histopathology, it provides limited morphologic interpretability. To address this limitation, we introduce Concept-Level Explainable Attention-guided Representation for HPV (CLEAR-HPV), a framework that restructures the MIL latent space using attention to enable concept discovery without requiring concept labels during training. Operating in an attention-weighted latent space, CLEAR-HPV automatically discovers keratinizing, basaloid, and stromal morphologic concepts, generates spatial concept maps, and represents each slide using a compact concept-fraction vector. CLEAR-HPV’s concept-fraction vectors preserve the predictive information of the original MIL embeddings while reducing the high-dimensional feature space (e.g., 1536 dimensions) to only 10 interpretable concepts. CLEAR-HPV generalizes consistently across TCGA-HNSCC, TCGA-CESC, and CPTAC-HNSCC, providing compact, concept-level interpretability through a general, backbone-agnostic framework for attention-based MIL models of whole-slide histopathology.
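The abstract does not say how concepts are extracted from the attention-weighted latent space, so the following is only a rough sketch of what a concept-fraction vector could look like, not the authors' method. It assumes attention-weighted k-means over patch embeddings, with each slide summarized by the attention mass falling in each cluster. The function name `concept_fractions` and all of its parameters are hypothetical, and attention scores are assumed to be strictly positive (for example, softmax outputs).

```python
import numpy as np

def concept_fractions(embeddings, attention, n_concepts=10, n_iter=20, seed=0):
    """Illustrative only: attention-weighted k-means, then per-slide fractions.

    embeddings: (n_patches, dim) patch features from any MIL backbone.
    attention:  (n_patches,) strictly positive attention scores.
    Returns (labels, fractions), where fractions has length n_concepts
    and sums to 1.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(embeddings, dtype=float)
    w = np.asarray(attention, dtype=float)
    w = w / w.sum()  # normalise attention to a probability vector

    # Assumption: seed centroids with patches sampled in proportion to attention.
    start = rng.choice(len(x), size=n_concepts, replace=False, p=w)
    centroids = x[start].copy()

    def assign(c):
        # Squared Euclidean distance from every patch to every centroid.
        d = (x ** 2).sum(1)[:, None] - 2.0 * x @ c.T + (c ** 2).sum(1)[None, :]
        return d.argmin(axis=1)

    for _ in range(n_iter):
        labels = assign(centroids)
        for k in range(n_concepts):
            mask = labels == k
            if mask.any():
                # Attention-weighted mean, so high-attention patches dominate.
                centroids[k] = np.average(x[mask], axis=0, weights=w[mask])

    labels = assign(centroids)
    # Concept-fraction vector: total attention mass assigned to each concept.
    fractions = np.bincount(labels, weights=w, minlength=n_concepts)
    return labels, fractions
```

Under these assumptions, a slide with 1536-dimensional patch embeddings is reduced to a 10-entry vector, and the per-patch `labels` can be painted back onto patch coordinates to form a spatial concept map.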
Problem

Research questions and friction points this paper is trying to address.

HPV
histopathology
interpretability
morphologic concepts
multiple instance learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

interpretable concept discovery
attention-based MIL
HPV-associated morphology
concept-fraction vector
whole-slide histopathology
Weiyi Qin
Department of Computer Science, Rutgers University, New Brunswick, NJ, USA
Yingci Liu-Swetz
Rutgers Health, Rutgers University, Newark, NJ, USA
Shiwei Tan
Department of Computer Science, Rutgers University, New Brunswick, NJ, USA
Hao Wang
Assistant Professor of Computer Science, Rutgers University
Statistical Machine Learning, Deep Learning, Bayesian Deep Learning, ML4Health, Recommender Systems / Data Mining