Attribute-formed Class-specific Concept Space: Endowing Language Bottleneck Model with Better Interpretability and Scalability

📅 2025-03-26
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Current Language Bottleneck Models (LBMs) for image recognition face two key challenges: (1) flat concept lists that invite spurious-cue inference, and (2) poor generalization to unseen classes. To address these, the paper proposes the Attribute-formed Language Bottleneck Model (ALBM), which organizes textual concepts into a fine-grained "class × attribute" structure, enabling recognition that is both interpretable and generalizable. The method introduces three core contributions: (1) a systematic paradigm for constructing attribute-formed class-specific concept spaces; (2) Visual Attribute Prompt Learning (VAPL), which extracts visual features for fine-grained attributes to improve vision–language alignment; and (3) a Description, Summary, and Supplement (DSS) strategy that automatically generates high-quality concept sets, balancing zero-shot transferability with annotation efficiency. Evaluated on nine widely used few-shot benchmarks, the approach improves interpretability and cross-class generalization, supports zero-shot concept transfer, and achieves strong performance. The authors publicly release both the source code and the collected concept sets.

📝 Abstract
Language Bottleneck Models (LBMs) are proposed to achieve interpretable image recognition by classifying images based on textual concept bottlenecks. However, current LBMs simply list all concepts together as the bottleneck layer, which leads to the spurious cue inference problem and cannot generalize to unseen classes. To address these limitations, we propose the Attribute-formed Language Bottleneck Model (ALBM). ALBM organizes concepts in an attribute-formed class-specific space, where concepts are descriptions of specific attributes of specific classes. In this way, ALBM avoids the spurious cue inference problem by classifying solely based on the essential concepts of each class. In addition, the cross-class unified attribute set ensures that the concept spaces of different classes are strongly correlated; as a result, the learned concept classifier can be easily generalized to unseen classes. Moreover, to further improve interpretability, we propose Visual Attribute Prompt Learning (VAPL) to extract visual features of fine-grained attributes. Furthermore, to avoid labor-intensive concept annotation, we propose the Description, Summary, and Supplement (DSS) strategy to automatically generate high-quality concept sets with complete and precise attribute coverage. Extensive experiments on 9 widely used few-shot benchmarks demonstrate the interpretability, transferability, and performance of our approach. The code and collected concept sets are available at https://github.com/tiggers23/ALBM.
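To make the "class × attribute" idea concrete, here is a minimal toy sketch (not the paper's implementation) of an attribute-formed concept bottleneck: each class gets one concept embedding per attribute from a shared attribute set, so image–concept similarities form a class × attribute matrix rather than a flat concept list, and class scores aggregate only each class's own concepts. The class names, attribute names, and random embeddings are hypothetical stand-ins for CLIP-like encoder outputs.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical classes and the cross-class unified attribute set.
classes = ["sparrow", "cardinal"]
attributes = ["beak shape", "wing color", "belly color"]
d = 8  # embedding dimension (stand-in for a real vision-language encoder)

# One concept embedding per (class, attribute) pair, e.g. the text embedding
# of "a cardinal with bright red wings"; random here for illustration.
concept_emb = rng.normal(size=(len(classes), len(attributes), d))
concept_emb /= np.linalg.norm(concept_emb, axis=-1, keepdims=True)

def classify(image_emb):
    """Score each class by averaging its own attribute-wise concept similarities."""
    image_emb = image_emb / np.linalg.norm(image_emb)
    # sims[c, a] = similarity between the image and class c's concept for attribute a
    sims = concept_emb @ image_emb
    class_scores = sims.mean(axis=1)  # aggregate over the shared attribute set
    return classes[int(np.argmax(class_scores))], sims

# An image embedding lying near the "cardinal" concepts is labeled cardinal,
# and sims exposes which attribute concepts drove the decision.
label, sims = classify(concept_emb[1].mean(axis=0))
print(label, sims.shape)
```

Because every class is scored only against its own class-specific concepts, an unrelated class's descriptions cannot act as spurious cues, and the shared attribute axis is what lets a classifier learned on seen classes transfer to unseen ones whose concepts fill the same slots.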
Problem

Research questions and friction points this paper is trying to address.

Address spurious cue inference in Language Bottleneck Models
Improve interpretability via attribute-formed class-specific concept space
Enable generalization to unseen classes with unified attributes
Innovation

Methods, ideas, or system contributions that make the work stand out.

Organizes concepts in class-specific attribute space
Uses Visual Attribute Prompt Learning
Automates concept generation with DSS strategy