LLEXICORP: End-user Explainability of Convolutional Neural Networks

📅 2025-11-04
📈 Citations: 0
Influential citations: 0
🤖 AI Summary
Existing Concept Relevance Propagation (CRP) workflows can localize the semantic concept channels underlying CNN decisions, but they rely heavily on manual intervention for concept naming and explanation generation, which limits scalability and accessibility. This paper introduces LLEXICORP, a modular interpretability pipeline that decouples concept naming from natural-language explanation generation. Leveraging a multimodal large language model (MLLM), LLEXICORP automatically names the salient concept channels extracted by CRP and uses example-guided prompt engineering to generate faithful, readable, audience-tailored explanations. Qualitative experiments on a VGG16 model with ImageNet images indicate that LLEXICORP improves the intuitiveness and consistency of explanations, lowers the barrier to understanding deep models, and points toward an automated, scalable paradigm for eXplainable AI (XAI).
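To make the decoupling concrete, here is a minimal, hypothetical sketch of how such a two-stage pipeline could be wired together. The `Concept` structure, the `mllm.generate` call, and the prompt wording are illustrative assumptions, not the paper's actual API.

```python
# Hypothetical sketch of a CRP + MLLM explanation pipeline (names are illustrative).
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Concept:
    layer: str             # layer the channel belongs to, e.g. "features.28"
    channel: int           # channel index identified by CRP
    relevance: float       # share of total relevance attributed to this channel
    prototypes: List[str]  # paths to the images that most activate the channel

def name_concept(mllm, concept: Concept) -> str:
    """Stage 1: ask the multimodal LLM only for a short concept name."""
    prompt = ("These images show what a single CNN channel responds to. "
              "Reply with a short descriptive name for the shared concept.")
    return mllm.generate(prompt, images=concept.prototypes)

def explain_prediction(mllm, prediction: str, named: List[Tuple[str, Concept]]) -> str:
    """Stage 2: turn the quantitative relevance distribution into a narrative."""
    table = "\n".join(f"- {name}: {c.relevance:.1%} of relevance" for name, c in named)
    prompt = (f"The model predicted '{prediction}'. The concepts below, found by "
              f"concept relevance propagation, contributed as listed:\n{table}\n"
              "Write a faithful, plain-language explanation for a non-expert.")
    return mllm.generate(prompt)

def llexicorp_style_explanation(mllm, prediction: str, concepts: List[Concept]) -> str:
    named = [(name_concept(mllm, c), c) for c in concepts]   # naming is decoupled...
    return explain_prediction(mllm, prediction, named)       # ...from explanation
```

The key design point mirrored here is the separation of tasks: the model that names a concept never sees the relevance numbers, and the explanation prompt only receives already-named concepts.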

📝 Abstract
Convolutional neural networks (CNNs) underpin many modern computer vision systems. As their applications range from everyday to safety-critical domains, a need to explain and understand these models and their decisions (XAI) has emerged. Prior works suggest that in the top layers of CNNs, individual channels can be associated with human-understandable concepts. Concept relevance propagation (CRP) methods can backtrack predictions to these channels and find images that most activate them. However, current CRP workflows are largely manual: experts must inspect activation images to name the discovered concepts and must synthesize verbose explanations from relevance maps, limiting the accessibility of the explanations and their scalability. To address these issues, we introduce Large Language model EXplaIns COncept Relevance Propagation (LLEXICORP), a modular pipeline that couples CRP with a multimodal large language model. Our approach automatically assigns descriptive names to concept prototypes and generates natural-language explanations that translate quantitative relevance distributions into intuitive narratives. To ensure faithfulness, we craft prompts that teach the language model the semantics of CRP through examples and enforce a separation between naming and explanation tasks. The resulting text can be tailored to different audiences, offering low-level technical descriptions for experts and high-level summaries for non-technical stakeholders. We qualitatively evaluate our method on a VGG16 model using various images from ImageNet. Our findings suggest that integrating concept-based attribution methods with large language models can significantly lower the barrier to interpreting deep neural networks, paving the way for more transparent AI systems.
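The abstract's notion of images that "most activate" a concept channel can be illustrated with a small PyTorch sketch that ranks images by the mean activation of a single VGG16 channel. This is a simplified stand-in for CRP's prototype selection, not the actual method; the layer and channel indices are assumptions chosen for illustration.

```python
# Sketch: rank images by mean activation of one VGG16 channel
# (a simplified stand-in for CRP prototype selection, assuming a torchvision setup).
import torch
from torchvision import models

model = models.vgg16(weights=models.VGG16_Weights.DEFAULT).eval()
layer, channel = model.features[28], 42   # last conv layer, hypothetical channel index

activations = {}
def hook(_module, _inputs, output):
    activations["value"] = output.detach()
layer.register_forward_hook(hook)

def channel_score(image_batch: torch.Tensor) -> torch.Tensor:
    """Mean spatial activation of the chosen channel for each image in the batch."""
    with torch.no_grad():
        model(image_batch)
    return activations["value"][:, channel].mean(dim=(1, 2))

# Given a list of (path, preprocessed_tensor) pairs, the most-activating images
# would serve as the concept's visual prototypes shown to the MLLM, e.g.:
#   scores = channel_score(torch.stack([t for _, t in samples]))
#   prototypes = [samples[i][0] for i in scores.topk(8).indices]
```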
Problem

Research questions and friction points this paper is trying to address.

Automating manual concept naming and explanation generation in neural networks
Translating quantitative relevance maps into intuitive natural language narratives
Making AI explanations scalable and accessible to both technical and non-technical audiences
Innovation

Methods, ideas, or system contributions that make the work stand out.

Automates concept naming via multimodal language model
Generates tailored natural-language explanations for predictions
Ensures explanation faithfulness through structured prompting (see the prompt sketch after this list)
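A rough idea of what structured prompting for the explanation stage might look like, assuming a template that first teaches the model what CRP relevance fractions mean through a worked example while concept naming is handled in a separate prompt. All wording and numbers below are invented for illustration and are not the paper's actual prompts.

```python
# Illustrative prompt skeleton for the explanation stage (assumed wording).
EXPLANATION_PROMPT = """\
You will receive concepts found by Concept Relevance Propagation (CRP).
Each concept is a named channel with the fraction of the prediction's
relevance it carries. Higher fractions mean stronger influence.

Example input:
  prediction: tabby cat
  concepts: striped fur (41%), pointed ears (23%), whiskers (12%)
Example explanation:
  The model mainly relied on the striped fur pattern, supported by the
  pointed ears and whiskers, to decide this is a tabby cat.

Now explain the following case for a {audience} reader:
  prediction: {prediction}
  concepts: {concept_list}
"""

# Filling the template for a hypothetical case:
print(EXPLANATION_PROMPT.format(
    audience="non-technical",
    prediction="goldfinch",
    concept_list="yellow plumage (38%), small beak (21%), perched posture (9%)",
))
```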
Vojtěch Kůr
Undergraduate student researcher
Adam Bajger
RationAI, Masaryk University, Faculty of Informatics, Brno, Czech Republic
Adam Kukucka
RationAI, Masaryk University, Faculty of Informatics, Brno, Czech Republic
Marek Hradil
RationAI, Masaryk University, Faculty of Informatics, Brno, Czech Republic
Vít Musil
Assistant Professor, Masaryk University, Brno, Czechia
functional analysis · machine learning · optimization
Tomáš Brázdil
RationAI, Masaryk University, Faculty of Informatics, Brno, Czech Republic