CLEAR-HPV: Interpretable Concept Discovery for HPV-Associated Morphology in Whole-Slide Histology

📅 2026-02-04

🏛️ bioRxiv

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the limited morphological interpretability of attention-based multiple instance learning (MIL) models that predict human papillomavirus (HPV) status from whole-slide histopathology images. The authors propose CLEAR-HPV, a framework that discovers HPV-associated histomorphological concepts (keratinizing, basaloid, and stromal) without requiring concept-level annotations. Operating in an attention-weighted latent space, the approach generates spatial concept maps and compact concept-fraction vectors, compressing high-dimensional embeddings (e.g., 1536 dimensions) into only 10 interpretable morphological concepts. The framework is presented as the first to extract HPV-related morphological concepts automatically without concept labels, and it is compatible with various backbone networks. Experiments on TCGA-HNSCC, TCGA-CESC, and CPTAC-HNSCC indicate that the concept-fraction vectors preserve the predictive performance of conventional MIL embeddings while making the model substantially more interpretable.

📝 Abstract

Human papillomavirus (HPV) status is a critical determinant of prognosis and treatment response in head and neck and cervical cancers. Although attention-based multiple instance learning (MIL) achieves strong slide-level prediction for HPV-related whole-slide histopathology, it provides limited morphologic interpretability. To address this limitation, we introduce Concept-Level Explainable Attention-guided Representation for HPV (CLEAR-HPV), a framework that restructures the MIL latent space using attention to enable concept discovery without requiring concept labels during training. Operating in an attention-weighted latent space, CLEAR-HPV automatically discovers keratinizing, basaloid, and stromal morphologic concepts, generates spatial concept maps, and represents each slide using a compact concept-fraction vector. CLEAR-HPV’s concept-fraction vectors preserve the predictive information of the original MIL embeddings while reducing the high-dimensional feature space (e.g., 1536 dimensions) to only 10 interpretable concepts. CLEAR-HPV generalizes consistently across TCGA-HNSCC, TCGA-CESC, and CPTAC-HNSCC, providing compact, concept-level interpretability through a general, backbone-agnostic framework for attention-based MIL models of whole-slide histopathology.
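
To make the idea of a concept-fraction vector concrete, the following is a minimal sketch and not the authors' implementation. It assumes that patch embeddings are scaled by normalized attention weights, that each patch is assigned to the nearest of 10 concept centroids, and that the slide is summarized by the fraction of patches per concept. The abstract does not specify how concepts are formed, so the function names (`attention_weight`, `assign_concepts`, `concept_fractions`), the nearest-centroid assignment, and the random data are all hypothetical.

```python
import numpy as np

def attention_weight(embeddings, attention):
    """Scale each patch embedding by its normalized attention weight.

    Assumption: this is one plausible reading of an 'attention-weighted
    latent space'; the paper may define it differently.
    """
    a = np.asarray(attention, dtype=float)
    a = a / a.sum()
    return np.asarray(embeddings, dtype=float) * a[:, None]

def assign_concepts(points, centroids):
    """Return the index of the nearest centroid (concept) for each patch."""
    diffs = points[:, None, :] - centroids[None, :, :]
    return (diffs ** 2).sum(axis=2).argmin(axis=1)

def concept_fractions(labels, n_concepts):
    """Summarize a slide as the fraction of its patches in each concept."""
    counts = np.bincount(labels, minlength=n_concepts)
    return counts / counts.sum()

# Hypothetical slide: 200 patches with 1536-dimensional embeddings.
rng = np.random.default_rng(0)
patches = rng.normal(size=(200, 1536))
attention = rng.random(200)
# Hypothetical centroids, e.g. obtained by clustering training slides.
centroids = rng.normal(size=(10, 1536)) / 200

weighted = attention_weight(patches, attention)
labels = assign_concepts(weighted, centroids)
slide_vector = concept_fractions(labels, n_concepts=10)
print(slide_vector.shape)  # one entry per concept
```

In this sketch the per-patch `labels` would play the role of a spatial concept map once placed back at their slide coordinates, while `slide_vector` is the 10-entry summary that replaces the 1536-dimensional embedding.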

Problem

Research questions and friction points this paper is trying to address.

HPV

histopathology

interpretability

morphologic concepts

multiple instance learning

Innovation

Methods, ideas, or system contributions that make the work stand out.

interpretable concept discovery

attention-based MIL

HPV-associated morphology