ProtoMask: Segmentation-Guided Prototype Learning

📅 2025-10-01
🤖 AI Summary
Existing prototype-based case reasoning methods rely on post-hoc saliency techniques to interpret prototype semantics, and such techniques have been criticized for limited reliability and semantic inconsistency. This paper proposes ProtoMask, a framework that integrates image segmentation foundation models (e.g., SAM) into prototype learning: the bounding box of each generated segmentation mask is used to crop the input image, so every mask yields a separate model input and the saliency computation for each prototype is restricted to a predefined semantic image patch, improving the truthfulness of the embedding-to-input mapping. Experiments on three fine-grained recognition benchmarks show competitive classification performance alongside more reliable, semantically consistent, and visually verifiable explanations. The core contribution is a segmentation-driven prototype localization mechanism that strengthens prototype interpretability and traceability back to the input image.

📝 Abstract
XAI has gained considerable importance in recent years. Methods based on prototypical case-based reasoning have shown a promising improvement in explainability. However, these methods typically rely on additional post-hoc saliency techniques to explain the semantics of learned prototypes. Multiple critiques have been raised about the reliability and quality of such techniques. For this reason, we study the use of prominent image segmentation foundation models to improve the truthfulness of the mapping between embedding and input space. We aim to restrict the computation area of the saliency map to a predefined semantic image patch to reduce the uncertainty of such visualizations. To perceive the information of an entire image, we use the bounding box from each generated segmentation mask to crop the image. Each mask results in an individual input in our novel model architecture named ProtoMask. We conduct experiments on three popular fine-grained classification datasets with a wide set of metrics, providing a detailed overview of explainability characteristics. The comparison with other popular models demonstrates competitive performance and unique explainability features of our model. https://github.com/uos-sis/quanproto
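The mask-to-crop step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: `mask_to_bbox` and `crop_per_mask` are hypothetical helper names, and the segmentation masks are assumed to be binary NumPy arrays the size of the image.

```python
import numpy as np

def mask_to_bbox(mask):
    """Return (top, left, bottom, right) of the tight bounding box
    around the nonzero pixels of a binary segmentation mask."""
    ys, xs = np.nonzero(mask)
    return ys.min(), xs.min(), ys.max() + 1, xs.max() + 1

def crop_per_mask(image, masks):
    """Crop the image once per segmentation mask; each crop then
    serves as a separate input to the prototype model."""
    crops = []
    for mask in masks:
        t, l, b, r = mask_to_bbox(mask)
        crops.append(image[t:b, l:r])
    return crops
```

Because every mask produces its own crop, the model sees one input per semantic region while the union of crops still covers the information of the entire image.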
Problem

Research questions and friction points this paper is trying to address.

Improving truthfulness of prototype-to-input mapping using segmentation
Reducing saliency map uncertainty via semantic image patches
Enhancing explainability in fine-grained classification through segmented prototypes
Innovation

Methods, ideas, or system contributions that make the work stand out.

Segmentation-guided prototype learning improves explainability
Uses image segmentation to restrict saliency map computation
Novel architecture processes individual masks as separate inputs
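The idea of restricting saliency computation to a predefined semantic patch can be sketched as below. This is an assumed formulation for illustration only; `restrict_saliency` is a hypothetical helper, and the paper's actual saliency pipeline may differ.

```python
import numpy as np

def restrict_saliency(saliency, mask, eps=1e-8):
    """Zero out saliency values outside the semantic mask and
    renormalize, so the explanation can only highlight pixels
    inside the predefined image patch."""
    s = saliency * mask
    return s / (s.sum() + eps)
```

Constraining the map this way removes attributions that fall on unrelated background, which is one way to reduce the uncertainty of prototype visualizations.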