🤖 AI Summary
Pretrained concept bottleneck models (CBMs) typically predict concepts independently, which limits modeling of inter-concept dependencies, weakens intervention performance, and makes stochastic alternatives costly to retrain. To address these issues, the authors propose Post-hoc Stochastic Concept Bottleneck Models (PSCBMs). PSCBM adds a lightweight covariance-prediction module to any pre-trained CBM—without updating the backbone network—to explicitly model a joint multivariate normal distribution over concepts, enabling effective interventions. Two training strategies are proposed for this module. On real-world benchmarks, PSCBMs consistently match or improve both concept and target prediction accuracy over standard CBMs, perform substantially better under interventions thanks to the modeled concept dependencies, and remain far more efficient in training and inference than retraining a similar stochastic model from scratch.
📝 Abstract
Concept Bottleneck Models (CBMs) are interpretable models that predict the target variable through high-level human-understandable concepts, allowing users to intervene on mispredicted concepts to adjust the final output. While recent work has shown that modeling dependencies between concepts can improve CBM performance, especially under interventions, such approaches typically require retraining the entire model, which may be infeasible when access to the original data or compute is limited. In this paper, we introduce Post-hoc Stochastic Concept Bottleneck Models (PSCBMs), a lightweight method that augments any pre-trained CBM with a multivariate normal distribution over concepts by adding only a small covariance-prediction module, without retraining the backbone model. We propose two training strategies and show on real-world data that PSCBMs consistently match or improve both concept and target accuracy over standard CBMs at test time. Furthermore, we show that due to the modeling of concept dependencies, PSCBMs perform much better than CBMs under interventions, while remaining far more efficient than retraining a similar stochastic model from scratch.
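To illustrate why modeling concept dependencies helps under interventions, the sketch below conditions a multivariate normal over concepts on a few intervened values and propagates the correction to the remaining concepts via the standard Gaussian conditioning formula. This is a minimal, hypothetical illustration, not the paper's implementation: in PSCBM the mean comes from the pre-trained concept predictor and the covariance from the added module, whereas here both are taken as given, and the helper name `conditional_gaussian` is our own.

```python
import numpy as np

def conditional_gaussian(mu, Sigma, idx_obs, x_obs):
    """Condition N(mu, Sigma) on intervened concept values.

    mu, Sigma : mean and covariance over all concepts
    idx_obs   : indices of concepts fixed by the user intervention
    x_obs     : the intervened values at those indices

    Returns the indices, conditional mean, and conditional covariance
    of the remaining (non-intervened) concepts.
    """
    n = len(mu)
    idx_free = np.array([i for i in range(n) if i not in set(idx_obs)])
    idx_obs = np.array(idx_obs)
    S_ff = Sigma[np.ix_(idx_free, idx_free)]
    S_fo = Sigma[np.ix_(idx_free, idx_obs)]
    S_oo = Sigma[np.ix_(idx_obs, idx_obs)]
    K = S_fo @ np.linalg.inv(S_oo)              # regression of free on observed
    mu_cond = mu[idx_free] + K @ (x_obs - mu[idx_obs])
    Sigma_cond = S_ff - K @ S_fo.T
    return idx_free, mu_cond, Sigma_cond

# Toy example: two positively correlated concepts. Intervening on the
# first shifts the posterior mean of the second upward, whereas a CBM
# with independent concepts would leave it unchanged.
mu = np.array([0.2, 0.4])
Sigma = np.array([[1.0, 0.8],
                  [0.8, 1.0]])
_, mu_cond, _ = conditional_gaussian(mu, Sigma, [0], np.array([1.0]))
# mu_cond[0] = 0.4 + 0.8 * (1.0 - 0.2) = 1.04
```

A standard CBM would treat the off-diagonal of `Sigma` as zero, so interventions would never correct correlated but non-intervened concepts; the covariance module is what makes this propagation possible.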