🤖 AI Summary
Existing concept-based models (CBMs) offer interpretability only for the final task prediction, while the concept generation process remains a black box. Method: We propose H-CMR, the first CBM to be fully interpretable in both concept generation and task prediction. H-CMR models logical dependencies among concepts as a directed acyclic graph (DAG) and uses neural attention to dynamically select differentiable logic rules, which are applied hierarchically during reasoning. It supports human-in-the-loop interventions at inference time and the incorporation of prior knowledge during training. Contribution/Results: By unifying concept learning, graph-structured modeling, and differentiable logical reasoning, H-CMR matches state-of-the-art (SOTA) predictive performance while enabling strong interactivity: concept interventions can significantly improve accuracy at inference time, and model interventions improve data efficiency and few-shot generalization when background knowledge is available.
📄 Abstract
Concept-Based Models (CBMs) are a class of deep learning models that provide interpretability by explaining predictions through high-level concepts. These models first predict concepts and then use them to perform a downstream task. However, current CBMs offer interpretability only for the final task prediction, while the concept predictions themselves are typically made via black-box neural networks. To address this limitation, we propose Hierarchical Concept Memory Reasoner (H-CMR), a new CBM that provides interpretability for both concept and task predictions. H-CMR models relationships between concepts using a learned directed acyclic graph, where edges represent logic rules that define concepts in terms of other concepts. During inference, H-CMR employs a neural attention mechanism to select a subset of these rules, which are then applied hierarchically to predict all concepts and the final task. Experimental results demonstrate that H-CMR matches state-of-the-art performance while enabling strong human interaction through concept and model interventions. The former can significantly improve accuracy at inference time, while the latter can enhance data efficiency during training when background knowledge is available.
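To make the inference procedure concrete, here is a minimal, hypothetical sketch (not the authors' code) of the idea: concepts are nodes of a DAG, each derived concept has a small memory of logic rules over its parent concepts, and a per-input selector (standing in for the neural attention mechanism, stubbed here as a fixed index) picks one rule per concept before the rules are applied in topological order. All concept names and rules are invented for illustration.

```python
# Hypothetical sketch of H-CMR-style hierarchical inference.
# Source concepts would come from a neural encoder; here they are given.
from typing import Callable, Dict, List

Rule = Callable[[Dict[str, bool]], bool]

# Toy rule memory per derived concept (names are illustrative only).
RULES: Dict[str, List[Rule]] = {
    "wing": [lambda c: c["feather"] or c["metal"]],
    "flies": [
        lambda c: c["wing"] and not c["heavy"],  # rule 0: animal-like
        lambda c: c["wing"] and c["engine"],     # rule 1: aircraft-like
    ],
}

# Topological order of the DAG: source concepts first, then derived ones.
ORDER = ["feather", "metal", "heavy", "engine", "wing", "flies"]

def infer(sources: Dict[str, bool], select: Dict[str, int]) -> Dict[str, bool]:
    """Predict all concepts hierarchically; `select` plays the role of the
    attention mechanism, choosing one rule per derived concept."""
    concepts = dict(sources)
    for name in ORDER:
        if name in RULES:
            rule = RULES[name][select.get(name, 0)]
            concepts[name] = rule(concepts)
    return concepts

x = {"feather": False, "metal": True, "heavy": True, "engine": True}
print(infer(x, {"flies": 1}))
```

Because every prediction is the result of an explicitly selected rule, a human can intervene by overriding a concept value or swapping the selected rule; in the real model the rules are differentiable and learned end-to-end rather than hand-written.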