Chat-CBM: Towards Interactive Concept Bottleneck Models with Frozen Large Language Models

📅 2025-09-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
Traditional concept bottleneck models (CBMs) are constrained by numeric interventions, hindering the incorporation of novel concepts or external knowledge at test time; unsupervised CBMs further suffer from noisy, sparse concept activations that render user interventions ineffective. To address these limitations, we propose Chat-CBM—a novel framework that integrates a frozen large language model (LLM) into the CBM architecture, replacing conventional scalar concept classifiers with a language-based classifier. This enables direct semantic reasoning over concepts and supports high-level, interpretable interventions—including concept correction, addition/removal, and knowledge integration. Leveraging few-shot prompting, Chat-CBM harnesses the LLM’s linguistic understanding to establish an explainable, interactive, language-driven classification mechanism. Extensive experiments across nine benchmark datasets demonstrate that Chat-CBM significantly improves both predictive accuracy and the efficacy of human-in-the-loop interventions, while preserving strong concept-level interpretability.

📝 Abstract
Concept Bottleneck Models (CBMs) provide inherent interpretability by first predicting a set of human-understandable concepts and then mapping them to labels through a simple classifier. While users can intervene in the concept space to improve predictions, traditional CBMs typically employ a fixed linear classifier over concept scores, which restricts interventions to manual value adjustments and prevents the incorporation of new concepts or domain knowledge at test time. These limitations are particularly severe in unsupervised CBMs, where concept activations are often noisy and densely activated, making user interventions ineffective. We introduce Chat-CBM, which replaces score-based classifiers with a language-based classifier that reasons directly over concept semantics. By grounding prediction in the semantic space of concepts, Chat-CBM preserves the interpretability of CBMs while enabling richer and more intuitive interventions, such as concept correction, addition or removal of concepts, incorporation of external knowledge, and high-level reasoning guidance. Leveraging the language understanding and few-shot capabilities of frozen large language models, Chat-CBM extends the intervention interface of CBMs beyond numerical editing and remains effective even in unsupervised settings. Experiments on nine datasets demonstrate that Chat-CBM achieves higher predictive performance and substantially improves user interactivity while maintaining the concept-based interpretability of CBMs.
Problem

Research questions and friction points this paper is trying to address.

Traditional CBMs use fixed linear classifiers, restricting interventions to manual score edits
Unsupervised CBMs produce noisy, densely activated concepts, making user interventions ineffective
Score-based classifiers preclude semantic reasoning and the incorporation of external knowledge
Innovation

Methods, ideas, or system contributions that make the work stand out.

Replaces score-based linear classifiers with language-based reasoning over concept semantics
Enables semantic interventions such as concept correction, addition, and removal
Leverages frozen large language models via few-shot prompting while preserving interpretability
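The mechanism these bullets describe — verbalizing a CBM's concept activations and letting a frozen LLM classify over their semantics — can be sketched as follows. This is an illustrative reconstruction, not the paper's actual implementation: the function names, the activation threshold, and the prompt format are all assumptions.

```python
# Illustrative sketch of a language-based concept classifier in the spirit of
# Chat-CBM (names, threshold, and prompt format are hypothetical, not the
# paper's exact API). A CBM backbone yields numeric concept scores; instead
# of a linear head, we verbalize the active concepts and build a few-shot
# prompt for a frozen LLM to complete with a label.

def verbalize_concepts(scores, concept_names, threshold=0.5):
    """Turn numeric concept activations into a human-readable description."""
    active = [name for name, s in zip(concept_names, scores) if s >= threshold]
    return ", ".join(active) if active else "no salient concepts"

def build_prompt(scores, concept_names, labels, few_shot=()):
    """Assemble a few-shot classification prompt for a frozen LLM."""
    lines = [
        "Classify the image based on its detected concepts.",
        f"Possible labels: {', '.join(labels)}.",
    ]
    for demo_desc, demo_label in few_shot:  # few-shot demonstrations
        lines.append(f"Concepts: {demo_desc} -> Label: {demo_label}")
    lines.append(f"Concepts: {verbalize_concepts(scores, concept_names)} -> Label:")
    return "\n".join(lines)

def intervene(description, add=(), remove=()):
    """Language-space intervention: edit the concept description directly,
    e.g. correcting, adding, or removing concepts before querying the LLM."""
    concepts = [c for c in description.split(", ") if c not in remove]
    concepts.extend(add)
    return ", ".join(concepts)
```

In this sketch the final label would come from sending the prompt to a frozen LLM; only prompt construction and the intervention interface are shown. The key design point the bullets make is that interventions become text edits (correct, add, or remove a concept phrase) rather than adjustments to opaque numeric scores.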
Authors

Hangzhou He — PhD student, Peking University (Explainability, Medical Image Analysis, Trustworthy AI)
Lei Zhu — Department of Biomedical Engineering, College of Future Technology, Peking University
Kaiwen Li — Department of Biomedical Engineering, College of Future Technology, Peking University
Xinliang Zhang — Department of Biomedical Engineering, College of Future Technology, Peking University
Jiakui Hu — Department of Biomedical Engineering, College of Future Technology, Peking University
Ourui Fu — Department of Biomedical Engineering, College of Future Technology, Peking University
Zhengjian Yao — Peking University (computer vision, generative model)
Yanye Lu — Peking University (Medical Imaging, Deep Learning, Machine Learning)