Enhancing Concept Localization in CLIP-based Concept Bottleneck Models

📅 2025-10-08
🤖 AI Summary
CLIP-based concept bottleneck models (CBMs) suffer from concept hallucination during zero-shot concept extraction: CLIP incorrectly judges whether a concept is present in an image, which undermines the reliability of the resulting explanations. To address this, the paper proposes CHILI (Concept Hallucination Inhibition via Localized Interpretability), which enables pixel-level concept localization from image-level CLIP features via an interpretability-guided embedding disentanglement mechanism. CHILI first isolates the local features semantically relevant to the target concept and then synthesizes high-fidelity attention maps, without requiring additional annotations or model fine-tuning. Experiments show that CHILI reduces concept misclassification and produces attribution maps with better fidelity and interpretability than existing zero-shot CBM approaches. By mitigating hallucination while preserving zero-shot capability, CHILI supports more trustworthy, vision-language-model-based explainable AI (XAI).

📝 Abstract
This paper addresses explainable AI (XAI) through the lens of Concept Bottleneck Models (CBMs) that do not require explicit concept annotations, relying instead on concepts extracted with CLIP in a zero-shot manner. We show that CLIP, which is central to these techniques, is prone to concept hallucination: it incorrectly predicts the presence or absence of concepts within an image in the settings used by numerous CBMs, undermining the faithfulness of explanations. To mitigate this issue, we introduce Concept Hallucination Inhibition via Localized Interpretability (CHILI), a technique that disentangles image embeddings and localizes the pixels corresponding to target concepts. Furthermore, our approach supports the generation of more interpretable saliency-based explanations.
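For context, the zero-shot concept-extraction step that these CBMs build on CLIP amounts to scoring an image embedding against a bank of concept text embeddings by cosine similarity and thresholding the result. The sketch below illustrates that step with random vectors standing in for real CLIP embeddings (a real pipeline would encode the image and concept prompts with a CLIP model); the threshold value and function names are illustrative, not from the paper.

```python
import numpy as np

def concept_scores(image_emb: np.ndarray, concept_embs: np.ndarray) -> np.ndarray:
    """Cosine similarity between one image embedding and a bank of
    concept text embeddings, as used for zero-shot concept extraction."""
    img = image_emb / np.linalg.norm(image_emb)
    txt = concept_embs / np.linalg.norm(concept_embs, axis=1, keepdims=True)
    return txt @ img

# Toy example: random vectors stand in for CLIP image/text embeddings.
rng = np.random.default_rng(0)
image_emb = rng.normal(size=512)
concept_embs = rng.normal(size=(5, 512))  # 5 candidate concepts

scores = concept_scores(image_emb, concept_embs)
present = scores > 0.0  # naive presence threshold; this image-level
                        # decision is exactly where hallucination bites
```

Because the decision is made from a single image-level embedding, a spuriously high similarity yields a hallucinated concept; CHILI instead disentangles the embedding and localizes the pixels supporting each concept before judging its presence.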
Problem

Research questions and friction points this paper is trying to address.

Mitigate concept hallucination in CLIP-based models
Enhance concept localization without explicit annotations
Generate more interpretable saliency-based explanations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces CHILI to inhibit concept hallucination
Disentangles image embeddings for concept localization
Generates saliency-based explanations for interpretability
Authors
Rémi Kazmierczak
Unité d’Informatique et d’Ingénierie des Systèmes, ENSTA Paris, Institut Polytechnique de Paris
Steve Azzolin
University of Trento
Eloïse Berthier
Unité d’Informatique et d’Ingénierie des Systèmes, ENSTA Paris, Institut Polytechnique de Paris
Goran Frehse
Unité d’Informatique et d’Ingénierie des Systèmes, ENSTA Paris, Institut Polytechnique de Paris
Gianni Franchi
U2IS, ENSTA Paris, Institut Polytechnique de Paris