Matryoshka Concept Bottleneck Models

📅 2026-05-19

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work targets two costs of traditional concept bottleneck models: test-time intervention requires experts to inspect and verify a large set of predicted concepts, and supporting different concept budgets requires training and maintaining separate models. The authors propose a unified architecture that brings Matryoshka representation learning to concept bottleneck models. Concepts are organized into a nested hierarchy according to maximum relevance and minimum redundancy, so a single model can run inference at multiple levels of conceptual granularity. The number of concepts used can be adjusted at inference time without retraining, and the authors show theoretically that the expected intervention cost drops from linear to logarithmic order, $O(\log K)$, with guaranteed monotonic performance improvement. Empirically, the method matches the performance of independently trained models while reducing the expert effort needed for intervention, which supports efficient and flexible human-AI collaborative reasoning.
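
As a rough illustration of ordering concepts by maximum relevance and minimum redundancy, the sketch below greedily ranks concept columns. It is not the authors' procedure: the use of absolute Pearson correlation as a proxy for both relevance and redundancy, the score "relevance minus mean redundancy", and the names `pearson` and `mrmr_order` are all assumptions made for this example.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def mrmr_order(concepts, labels):
    """Greedy ordering: at each step pick the concept with the highest
    (relevance to labels) minus (mean redundancy with already chosen concepts).

    concepts: dict mapping a concept name to its per-sample values.
    labels:   per-sample numeric labels.
    Assumption: |Pearson correlation| stands in for relevance and redundancy.
    """
    relevance = {name: abs(pearson(vals, labels)) for name, vals in concepts.items()}
    selected = []
    remaining = sorted(concepts)  # sorted so that ties break deterministically

    def score(name):
        if not selected:
            return relevance[name]
        redundancy = mean(abs(pearson(concepts[name], concepts[s])) for s in selected)
        return relevance[name] - redundancy

    while remaining:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data: "wing_copy" duplicates "wing", so it is ranked last despite high relevance.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
concepts = {
    "wing": [0, 0, 0, 1, 1, 1, 1, 1],
    "wing_copy": [0, 0, 0, 1, 1, 1, 1, 1],
    "beak": [0, 1, 0, 0, 1, 1, 0, 1],
}
order = mrmr_order(concepts, labels)
print(order)
```

In an ordering like this, every prefix is a candidate concept subset, which is what makes a nested (Matryoshka-style) structure possible: smaller concept sets are always contained in larger ones.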

📝 Abstract

Concept Bottleneck Models (CBMs) have emerged as a prominent paradigm for interpretable deep learning, grounding predictions in human-understandable concepts. However, their practical deployment is hindered by the high cost of test-time intervention, as correcting model errors typically requires human experts to manually inspect and verify a large set of predicted concepts. Existing approaches suffer from a fundamental structural limitation: they either adopt a single static concept set, forcing experts to exhaustively annotate concepts and incurring prohibitive intervention costs, or train multiple models tailored to different concept budgets, resulting in substantial computational and maintenance overhead. To address this challenge, we propose the Matryoshka Concept Bottleneck Model (MCBM), a unified architecture that enables adaptive concept utilization within a single model. Inspired by Matryoshka Representation Learning, MCBM organizes concepts into a nested hierarchy based on maximum relevance and minimum redundancy, allowing inference at multiple levels of conceptual granularity without retraining. Theoretically, we show that MCBM reduces the expected intervention cost from linear to logarithmic order, $O(\log K)$, while guaranteeing monotonic performance improvement. Empirically, extensive experiments demonstrate that MCBM matches the performance of independently trained models while enabling dynamic and efficient expert interaction.
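
To make the idea of nested concept sets in a single model concrete, here is a minimal sketch. It assumes, purely for illustration, that prefix sizes double (1, 2, 4, ...) and that the label head is linear; neither choice is stated in the abstract, and `nested_prefix_sizes` and `predict_with_prefix` are hypothetical names. Under the doubling assumption the number of granularity levels grows logarithmically in the number of concepts, which gives some intuition for how a logarithmic cost can arise, but it is not the paper's argument.

```python
def nested_prefix_sizes(num_concepts):
    """Return doubling prefix sizes 1, 2, 4, ... capped at num_concepts.

    Assumption: granularity levels double in size; the abstract does not
    specify how the nested levels are chosen.
    """
    if num_concepts < 1:
        raise ValueError("num_concepts must be at least 1")
    sizes = []
    k = 1
    while k < num_concepts:
        sizes.append(k)
        k *= 2
    sizes.append(num_concepts)
    return sizes


def predict_with_prefix(concept_scores, weights, bias, k):
    """Score a sample using only the first k concepts of a shared linear head.

    Concepts beyond position k are ignored, so one set of weights serves
    every granularity level without retraining.
    """
    return bias + sum(w * c for w, c in zip(weights[:k], concept_scores[:k]))


# Ten ordered concepts give five nested levels under the doubling assumption.
levels = nested_prefix_sizes(10)
print(levels)

# The same (hypothetical) weights are reused at every level.
scores = [1.0, 0.0, 1.0]
weights = [0.5, 2.0, -1.0]
coarse = predict_with_prefix(scores, weights, bias=0.1, k=2)
full = predict_with_prefix(scores, weights, bias=0.1, k=3)
print(coarse, full)
```

An expert who intervenes at a given level only needs to check the concepts in that prefix, so the concept budget can be chosen per query at inference time.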

Problem

Research questions and friction points this paper is trying to address.

Concept Bottleneck Models

test-time intervention

concept annotation cost

model maintenance overhead

interpretable deep learning

Innovation

Methods, ideas, or system contributions that make the work stand out.

Concept Bottleneck Models

Matryoshka Representation

Interpretable AI