Object Centric Concept Bottlenecks

Date: 2025-05-30
AI Summary
Existing concept bottleneck models (CBMs) based on global image encodings suffer from limited expressivity, which hinders their application to complex visual reasoning tasks such as multi-object recognition, multi-label classification, and structured reasoning. To address this, we propose Object-Centric Concept Bottlenecks (OCB), the first CBM framework grounded in the object-centric paradigm. OCB integrates pretrained foundation models (e.g., CLIP or SAM) with object detection or segmentation modules to construct an object-level concept encoder and a learnable aggregation mechanism. This enables fine-grained, interpretable concept activation at the object level, followed by a linear decision layer. By decoupling concept learning from global image representations, OCB improves accuracy and concept-level verifiability on multi-object classification and attribute reasoning tasks, while preserving representational capacity and model transparency.

Abstract
Developing high-performing, yet interpretable models remains a critical challenge in modern AI. Concept-based models (CBMs) attempt to address this by extracting human-understandable concepts from a global encoding (e.g., image encoding) and then applying a linear classifier on the resulting concept activations, enabling transparent decision-making. However, their reliance on holistic image encodings limits their expressiveness in object-centric real-world settings and thus hinders their ability to solve complex vision tasks beyond single-label classification. To tackle these challenges, we introduce Object-Centric Concept Bottlenecks (OCB), a framework that combines the strengths of CBMs and pre-trained object-centric foundation models, boosting performance and interpretability. We evaluate OCB on complex image datasets and conduct a comprehensive ablation study to analyze key components of the framework, such as strategies for aggregating object-concept encodings. The results show that OCB outperforms traditional CBMs and allows one to make interpretable decisions for complex visual tasks.
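The concept bottleneck described in the abstract (a global encoding mapped to concept activations, then a linear classifier) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the encodings are random vectors standing in for a pretrained encoder, concept activations are taken to be cosine similarities against a hypothetical `concept_bank`, and the function names are not from the paper.

```python
import numpy as np

def concept_activations(encoding, concept_bank):
    """Cosine similarity between one encoding and every concept vector (assumed scoring rule)."""
    enc = encoding / np.linalg.norm(encoding)
    bank = concept_bank / np.linalg.norm(concept_bank, axis=1, keepdims=True)
    return bank @ enc  # shape: (num_concepts,)

def predict(activations, weights, bias):
    """Linear classifier on concept activations.

    Returns the class logits together with the per-concept contribution
    to each logit, which is what makes the decision inspectable.
    """
    contributions = weights * activations  # shape: (num_classes, num_concepts)
    logits = contributions.sum(axis=1) + bias
    return logits, contributions

# Random stand-ins for a pretrained image encoder and a concept vocabulary.
rng = np.random.default_rng(0)
image_encoding = rng.normal(size=16)     # hypothetical global image encoding
concept_bank = rng.normal(size=(5, 16))  # 5 hypothetical concept vectors
weights = rng.normal(size=(3, 5))        # 3 classes, 5 concepts
bias = np.zeros(3)

acts = concept_activations(image_encoding, concept_bank)
logits, contributions = predict(acts, weights, bias)
top_class = int(np.argmax(logits))
print("predicted class:", top_class)
print("concept contributions:", contributions[top_class])
```

Because the classifier is linear, each logit decomposes exactly into one term per concept, so a prediction can be traced back to the concepts that drove it.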
Problem

Research questions and friction points this paper is trying to address.

Developing interpretable yet high-performing AI models
Improving concept-based models for object-centric tasks
Enhancing performance and transparency in complex vision tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Combines CBMs with object-centric foundation models
Aggregates object-concept encodings for better performance
Enhances interpretability in complex vision tasks
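
The aggregation of object-concept encodings listed above can be sketched as follows. This summary does not say which aggregation strategies the paper compares, so the three shown here (`max`, `sum`, `mean`) are common pooling choices assumed for illustration, and the activation values are made up.

```python
import numpy as np

def aggregate(object_concepts, strategy="max"):
    """Pool per-object concept activations into one image-level concept vector.

    object_concepts has shape (num_objects, num_concepts). The strategies
    below are illustrative assumptions, not the paper's confirmed choices.
    """
    if strategy == "max":
        return object_concepts.max(axis=0)   # is the concept present in any object?
    if strategy == "sum":
        return object_concepts.sum(axis=0)   # grows with the number of matching objects
    if strategy == "mean":
        return object_concepts.mean(axis=0)  # average evidence across objects
    raise ValueError(f"unknown strategy: {strategy}")

# Two detected objects scored against three hypothetical concepts.
object_concepts = np.array([
    [0.9, 0.1, 0.0],  # object 1
    [0.2, 0.8, 0.0],  # object 2
])

for name in ("max", "sum", "mean"):
    print(name, aggregate(object_concepts, name))
```

The pooled vector can then be passed to the same kind of linear classifier a standard CBM uses. The choice matters for interpretation: `max` answers whether a concept appears anywhere in the image, while `sum` also reflects how many objects exhibit it.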