Zero-shot Concept Bottleneck Models

📅 2025-02-13
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work introduces the first training-free zero-shot concept bottleneck model (Zero-shot CBM), addressing the fundamental limitations of conventional CBMs—namely, their reliance on task-specific labeled data and end-to-end neural network training. Methodologically, it leverages cross-modal retrieval over large-scale open concept banks (e.g., ConceptBank) to directly map input images to semantic concepts without learning; subsequently, sparse linear regression models the interpretable mapping from concepts to labels, eliminating parameter optimization. The key contribution is the first realization of fully zero-shot, parameter-free, concept-level inference—enabling plug-and-play, cross-domain interpretable prediction and human-editable concept intervention. Evaluated across diverse domains, the model achieves competitive accuracy while providing transparent, verifiable concept-level reasoning paths, drastically reducing both data dependency and computational overhead.

📝 Abstract
Concept bottleneck models (CBMs) are inherently interpretable and intervenable neural network models, which explain their final label predictions through intermediate predictions of high-level semantic concepts. However, they require target-task training to learn the input-to-concept and concept-to-label mappings, incurring the costs of target dataset collection and training resources. In this paper, we present *zero-shot concept bottleneck models* (Z-CBMs), which predict concepts and labels in a fully zero-shot manner without training neural networks. Z-CBMs utilize a large-scale concept bank, composed of millions of vocabulary terms extracted from the web, to describe arbitrary inputs across various domains. For the input-to-concept mapping, we introduce concept retrieval, which dynamically finds input-related concepts via cross-modal search over the concept bank. For the concept-to-label inference, we apply concept regression to select essential concepts from the retrieved candidates via sparse linear regression. Through extensive experiments, we confirm that our Z-CBMs provide interpretable and intervenable concepts without any additional training. Code will be available at https://github.com/yshinya6/zcbm.
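The input-to-concept stage described in the abstract can be sketched with toy data. The cosine-similarity retrieval below stands in for the paper's cross-modal search over the concept bank; the 5-dimensional one-hot vectors and concept names are illustrative placeholders, not actual CLIP features.

```python
import numpy as np

def retrieve_concepts(image_emb, concept_embs, concept_names, k=3):
    """Rank concepts by cosine similarity between the image embedding and
    each concept's text embedding; return the top-k matches with scores."""
    img = image_emb / np.linalg.norm(image_emb)
    txt = concept_embs / np.linalg.norm(concept_embs, axis=1, keepdims=True)
    sims = txt @ img                     # cosine similarity per concept
    top = np.argsort(-sims)[:k]
    return [(concept_names[i], round(float(sims[i]), 3)) for i in top]

# Toy 5-d embeddings standing in for image/text encoder features.
concept_names = ["striped fur", "whiskers", "feathers", "wings", "paws"]
concept_embs = np.eye(5)
image_emb = np.array([1.0, 0.3, 0.0, 0.0, 0.1])  # image closest to "striped fur"

print(retrieve_concepts(image_emb, concept_embs, concept_names, k=2))
# → [('striped fur', 0.953), ('whiskers', 0.286)]
```

In the actual method the concept bank holds millions of web-mined terms, so the search would use an approximate nearest-neighbor index rather than a dense matrix product.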
Problem

Research questions and friction points this paper is trying to address.

Zero-shot concept prediction without training
Dynamic concept retrieval from a large-scale bank
Interpretable and intervenable label explanation
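"Intervenable" here means a human can edit the intermediate concept scores and the label prediction updates accordingly. A minimal sketch of that idea, with made-up 3-d concept and label embeddings (hypothetical values, not the paper's features):

```python
import numpy as np

def predict_label(concept_weights, concept_embs, label_embs, label_names):
    """Zero-shot label inference: form the weighted combination of concept
    embeddings and pick the label whose embedding is most similar to it."""
    combo = concept_weights @ concept_embs
    combo = combo / np.linalg.norm(combo)
    labels = label_embs / np.linalg.norm(label_embs, axis=1, keepdims=True)
    return label_names[int(np.argmax(labels @ combo))]

concept_embs = np.eye(3)                 # toy: "whiskers", "wings", "fins"
label_embs = np.array([[1.0, 0.1, 0.0],  # "cat"  ~ whiskers
                       [0.1, 1.0, 0.0]]) # "bird" ~ wings
labels = ["cat", "bird"]

w = np.array([0.9, 0.2, 0.0])            # concept scores from retrieval
print(predict_label(w, concept_embs, label_embs, labels))        # → cat

w_edited = np.array([0.1, 0.9, 0.0])     # human intervention on the concepts
print(predict_label(w_edited, concept_embs, label_embs, labels)) # → bird
```

Because the concept scores sit between input and label, editing them changes the prediction through a path a human can read and verify.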
Innovation

Methods, ideas, or system contributions that make the work stand out.

Zero-shot concept prediction
Cross-modal concept retrieval
Sparse linear regression selection
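The sparse-selection idea above can be illustrated with a plain Lasso solver: approximate the image embedding as a sparse linear combination of concept embeddings, so that only a few concepts get nonzero weight. The ISTA loop below is a generic stand-in, not the authors' actual solver, and the orthogonal toy concepts are illustrative only.

```python
import numpy as np

def sparse_concept_weights(image_emb, concept_embs, lam=0.05, steps=500):
    """Solve min_w 0.5*||A w - x||^2 + lam*||w||_1 with ISTA, where the
    columns of A are concept embeddings; nonzero weights mark the
    'essential' concepts for this image."""
    A = concept_embs.T                    # (dim, n_concepts)
    w = np.zeros(A.shape[1])
    L = np.linalg.norm(A, 2) ** 2         # Lipschitz constant of the gradient
    for _ in range(steps):
        grad = A.T @ (A @ w - image_emb)
        z = w - grad / L
        w = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft-threshold
    return w

concepts = np.eye(4)                          # toy orthogonal concept embeddings
image = 0.9 * concepts[1] + 0.4 * concepts[3] # image built from concepts 1 and 3
w = sparse_concept_weights(image, concepts)
print(np.round(w, 2))   # weight mass concentrates on concepts 1 and 3
```

The L1 penalty is what yields a short, human-readable explanation: most retrieved concepts are driven exactly to zero, and only the few survivors enter the concept-to-label reasoning path.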