Cross- and Intra-image Prototypical Learning for Multi-label Disease Diagnosis and Interpretation

📅 2024-11-07

🏛️ IEEE Transactions on Medical Imaging

📈 Citations: 1

✨ Influential: 0

🤖 AI Summary

In multi-label medical image diagnosis, several diseases often appear in the same image, and their entangled features make it hard for existing prototypical learning models to learn effective class prototypes and meaningful activation maps. To address this, the paper proposes the Cross- and Intra-image Prototypical Learning (CIPL) framework. CIPL uses semantics shared across images to disentangle co-occurring diseases during prototype learning, and adds a two-level alignment-based regularization that exploits consistent intra-image information to improve interpretation robustness and predictive performance. On two public multi-label benchmarks (chest X-rays and fundus images), CIPL achieves state-of-the-art classification accuracy. In weakly supervised thoracic disease localization, it also outperforms leading saliency-based and prototype-based explanation methods.

📝 Abstract

Recent advances in prototypical learning have shown remarkable potential to provide useful decision interpretations by associating activation maps and predictions with class-specific training prototypes. Such prototypical learning has been well studied for various single-label diseases. In the closely related and more challenging task of multi-label diagnosis, however, multiple diseases are often concurrent within an image, and existing prototypical learning models struggle to obtain meaningful activation maps and effective class prototypes because the diseases are entangled. In this paper, we present a novel Cross- and Intra-image Prototypical Learning (CIPL) framework for accurate multi-label disease diagnosis and interpretation from medical images. CIPL takes advantage of common cross-image semantics to disentangle the multiple diseases when learning the prototypes, allowing a comprehensive understanding of complicated pathological lesions. Furthermore, we propose a new two-level alignment-based regularisation strategy that effectively leverages consistent intra-image information to enhance interpretation robustness and predictive performance. Extensive experiments show that CIPL attains state-of-the-art (SOTA) classification accuracy on two public multi-label benchmarks of disease diagnosis: thoracic radiography and fundus images. Quantitative interpretability results show that CIPL also outperforms other leading saliency- and prototype-based explanation methods in weakly-supervised thoracic disease localisation.
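To make the notion of prototype-based interpretation concrete, the following is a minimal sketch in plain Python of a generic prototype classifier applied to a multi-label input. It is not the CIPL method and does not reproduce its cross-image disentanglement or alignment regularisation. It only illustrates how an activation map and a per-class decision can be derived from class prototypes. The function names (`cosine`, `activation_map`, `diagnose`), the 0.5 threshold, the class labels, and the toy feature values are all assumptions made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / (norm + 1e-12)  # epsilon guards against zero vectors

def activation_map(patches, prototype):
    """Similarity of every patch feature in an H x W grid to one prototype."""
    return [[cosine(p, prototype) for p in row] for row in patches]

def diagnose(patches, prototypes, threshold=0.5):
    """Score each class independently (multi-label) and keep the peak location.

    `prototypes` maps a class name to a list of prototype vectors.
    `threshold` is an illustrative cut-off, not a value from the paper.
    """
    results = {}
    for name, protos in prototypes.items():
        best_score, best_loc = -math.inf, None
        for proto in protos:
            amap = activation_map(patches, proto)
            for r, row in enumerate(amap):
                for c, sim in enumerate(row):
                    if sim > best_score:  # strict '>' keeps the first peak
                        best_score, best_loc = sim, (r, c)
        results[name] = {
            "score": best_score,
            "present": best_score >= threshold,
            "location": best_loc,
        }
    return results

# Toy 2 x 2 grid of 2-D patch features (hypothetical values).
patches = [[[1, 0], [0, 1]],
           [[1, 1], [0, 1]]]
# One hypothetical prototype per hypothetical class.
prototypes = {
    "disease_a": [[1, 0]],
    "disease_b": [[0, 1]],
    "disease_c": [[-1, 0]],
}
findings = diagnose(patches, prototypes)
```

In this toy input, `disease_a` and `disease_b` are both flagged in the same image with peaks at different grid cells, while `disease_c` is not flagged. The patch `[1, 1]` responds to both of the first two prototypes, which is a small-scale picture of the entanglement problem the abstract describes: a plain similarity rule like this one has no mechanism for separating co-occurring classes.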

Problem

Research questions and friction points this paper is trying to address.

Disentangling multiple diseases in multi-label diagnosis using prototypes

Improving activation maps and class prototypes for concurrent diseases

Enhancing interpretation robustness and predictive performance in medical images

Innovation

Methods, ideas, or system contributions that make the work stand out.

Cross-image semantics disentangles multi-label diseases

Two-level alignment enhances interpretation robustness

State-of-the-art classification accuracy on two public multi-label disease diagnosis benchmarks

🔎 Similar Papers

Multi-modal vision-language model for generalizable annotation-free pathology localization and clinical diagnosis