🤖 AI Summary
Concept bottleneck models (CBMs) suffer from unreliable concept representations: they are prone to spurious background correlations and assign inconsistent semantic meanings to the same concept across samples, which leads to poor generalization and robustness. To address this, we propose the Reliability-Enhanced Concept Embedding Model (RECEM). RECEM introduces a concept-level feature disentanglement mechanism that explicitly separates concept-relevant from concept-irrelevant representations, together with a Concept Mixup strategy that enforces cross-sample semantic alignment of concepts in the latent space. By unifying concept bottleneck modeling, disentangled representation learning, and Mixup-based augmentation, RECEM jointly enhances interpretability and robustness. Extensive experiments on multiple benchmark datasets show that RECEM improves concept prediction accuracy by an average of 8.2% under background perturbations and domain shifts, while also boosting downstream task performance and stability.
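To make the disentanglement idea concrete, below is a minimal PyTorch sketch of what a concept-level disentanglement head could look like. The paper's exact architecture and losses are not given here, so the module names, dimensions, and the cosine-similarity penalty are illustrative assumptions, not RECEM's actual implementation.

```python
# Sketch of concept-level disentanglement (illustrative, not the paper's code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class DisentangledConceptHead(nn.Module):
    """Splits a backbone feature into a concept-relevant and a
    concept-irrelevant embedding for each of `n_concepts` concepts."""

    def __init__(self, feat_dim: int, emb_dim: int, n_concepts: int):
        super().__init__()
        # One relevant / irrelevant projector pair per concept (assumed design).
        self.relevant = nn.ModuleList(
            nn.Linear(feat_dim, emb_dim) for _ in range(n_concepts))
        self.irrelevant = nn.ModuleList(
            nn.Linear(feat_dim, emb_dim) for _ in range(n_concepts))
        # Concept presence is scored from the relevant part only.
        self.scorer = nn.Linear(emb_dim, 1)

    def forward(self, feats: torch.Tensor):
        rel = torch.stack([p(feats) for p in self.relevant], dim=1)    # (B, K, D)
        irr = torch.stack([p(feats) for p in self.irrelevant], dim=1)  # (B, K, D)
        logits = self.scorer(rel).squeeze(-1)                          # (B, K)
        return rel, irr, logits

def disentangle_loss(rel: torch.Tensor, irr: torch.Tensor) -> torch.Tensor:
    """Penalize similarity between the two parts so background information
    cannot leak into the concept-relevant embedding (assumed penalty)."""
    rel_n = F.normalize(rel, dim=-1)
    irr_n = F.normalize(irr, dim=-1)
    return (rel_n * irr_n).sum(-1).abs().mean()
```

Training would plausibly combine a concept prediction loss on `logits` with `disentangle_loss`, weighted by a hyperparameter; the weighting is not specified in the text above.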
📝 Abstract
Concept Bottleneck Models (CBMs) aim to enhance interpretability by predicting human-understandable concepts as intermediates for decision-making. However, these models often struggle to ensure reliable concept representations, and this unreliability can propagate to downstream tasks and undermine robustness, especially under distribution shifts. Two inherent issues contribute to concept unreliability: sensitivity to concept-irrelevant features (e.g., background variations) and a lack of semantic consistency for the same concept across different samples. To address these limitations, we propose the Reliability-Enhanced Concept Embedding Model (RECEM), which introduces a two-fold strategy: a Concept-Level Disentanglement module that separates irrelevant features from concept-relevant information, and a Concept Mixup mechanism that ensures semantic alignment across samples. These mechanisms work together to improve concept reliability, enabling the model to focus on meaningful object attributes and generate faithful concept representations. Experimental results demonstrate that RECEM consistently outperforms existing baselines across multiple datasets, showing superior performance under background and domain shifts. These findings highlight the effectiveness of disentanglement and alignment strategies in enhancing both reliability and robustness in CBMs.
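The Concept Mixup mechanism can likewise be sketched as Mixup-style interpolation applied per concept, restricted to pairs of samples that share that concept's label so the interpolation stays within one semantic class. The partner-sampling scheme and Beta-distributed mixing coefficient below are assumptions for illustration, not the paper's specification.

```python
# Sketch of Concept Mixup (illustrative, not the paper's code).
import torch

def concept_mixup(emb: torch.Tensor, labels: torch.Tensor, alpha: float = 0.4):
    """emb: (B, K, D) per-concept embeddings; labels: (B, K) binary concept labels.
    Mixes a concept slot only when the randomly chosen partner shares its label,
    so the mixed embedding keeps the same concept semantics and the same target."""
    B = emb.size(0)
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(B, device=emb.device)
    same = (labels == labels[perm]).unsqueeze(-1).float()  # (B, K, 1) mask
    mixed = same * (lam * emb + (1 - lam) * emb[perm]) + (1 - same) * emb
    return mixed, labels
```

Because unmixed slots pass through unchanged and partners share each mixed concept's label, the mixed batch can be fed to the same concept scorer and task head with the original labels.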