Towards more holistic interpretability: A lightweight disentangled Concept Bottleneck Model

📅 2025-10-17

📈 Citations: 0

✨ Influential: 0

career value

240K/year

🤖 AI Summary

Existing concept bottleneck models (CBMs) suffer from input-to-concept mapping bias and insufficient controllability, undermining their interpretability and accountability. This paper proposes a lightweight decoupled CBM that achieves precise alignment between learned concepts and local image patterns via unsupervised visual feature grouping. Our approach addresses these limitations through three key innovations: (1) a filter-based grouping loss that implicitly disentangles semantically correlated features; (2) joint concept supervision, enabling end-to-end co-optimization of concept discovery and downstream classification; and (3) complete elimination of region-level annotations. Evaluated on CUB, CLEVR, and ImageNet-100, the model achieves substantial improvements in concept accuracy (+5.2% to +9.8%) and classification performance, while significantly enhancing decision transparency and user controllability—without compromising computational efficiency.

Technology Category

Application Category

📝 Abstract

Concept Bottleneck Models (CBMs) enhance interpretability by predicting human-understandable concepts as intermediate representations. However, existing CBMs often suffer from input-to-concept mapping bias and limited controllability, which restricts their practical value, directly damage the responsibility of strategy from concept-based methods. We propose a lightweight Disentangled Concept Bottleneck Model (LDCBM) that automatically groups visual features into semantically meaningful components without region annotation. By introducing a filter grouping loss and joint concept supervision, our method improves the alignment between visual patterns and concepts, enabling more transparent and robust decision-making. Notably, Experiments on three diverse datasets demonstrate that LDCBM achieves higher concept and class accuracy, outperforming previous CBMs in both interpretability and classification performance. By grounding concepts in visual evidence, our method overcomes a fundamental limitation of prior models and enhances the reliability of interpretable AI.

Problem

Research questions and friction points this paper is trying to address.

Addresses input-to-concept mapping bias in interpretable AI models

Improves visual pattern alignment with human-understandable concepts

Enhances controllability and reliability of concept bottleneck models

Innovation

Methods, ideas, or system contributions that make the work stand out.

Lightweight model groups visual features automatically

Filter grouping loss improves visual-concept alignment

Joint supervision enhances concept and class accuracy

🔎 Similar Papers

Energy-Based Concept Bottleneck Models: Unifying Prediction, Concept Intervention, and Probabilistic Interpretations