Towards Achieving Concept Completeness for Unsupervised Textual Concept Bottleneck Models

📅 2025-02-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing text concept bottleneck models (CBMs) rely on labor-intensive manual annotations or large language model–based concept labeling, limiting scalability and practicality. Method: We propose CT-CBM, the first fully unsupervised CBM framework for text, built upon compact language models. It automatically discovers discriminative concepts via joint iterative optimization—integrating concept discovery, importance scoring, adaptive expansion, and bottleneck layer construction. To prevent classification leakage, we introduce parallel residual connections; to ensure concept consistency, we design unsupervised concept alignment and distillation mechanisms. Contribution/Results: On multiple text classification benchmarks, CT-CBM achieves state-of-the-art interpretability–accuracy trade-offs: concept completeness improves by 37%, classification accuracy matches or exceeds supervised baselines, and performance significantly surpasses existing unsupervised CBMs.

📝 Abstract
Textual Concept Bottleneck Models (TCBMs) are interpretable-by-design models for text classification that predict a set of salient concepts before making the final prediction. This paper proposes the Complete Textual Concept Bottleneck Model (CT-CBM), a novel TCBM generator that builds concept labels in a fully unsupervised manner using a small language model, eliminating the need for both predefined human-labeled concepts and LLM annotations. CT-CBM iteratively targets and adds important concepts to the bottleneck layer to form a complete concept basis, and addresses downstream classification leakage through a parallel residual connection. CT-CBM achieves strong results against competitors, offering a promising solution for enhancing the interpretability of NLP classifiers without sacrificing performance.
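The two-path architecture described above can be sketched minimally: a bottleneck that maps a text embedding to interpretable concept scores feeding the classifier, plus a parallel residual connection that carries information the concept basis misses. All dimensions, weight matrices, and the exact wiring here are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: 768-d text embedding, 10 discovered concepts, 2 classes.
d_embed, n_concepts, n_classes = 768, 10, 2

# Bottleneck: project the embedding onto interpretable concept scores.
W_concept = rng.normal(size=(d_embed, n_concepts)) * 0.01
# The classifier reads the concept scores...
W_cls = rng.normal(size=(n_concepts, n_classes)) * 0.01
# ...while a parallel residual path carries signal the concept basis misses,
# so class information is not forced to leak through the concepts.
W_res = rng.normal(size=(d_embed, n_classes)) * 0.01

def forward(x):
    concepts = 1.0 / (1.0 + np.exp(-x @ W_concept))  # sigmoid concept activations
    logits = concepts @ W_cls + x @ W_res            # concept path + residual path
    return concepts, logits

x = rng.normal(size=(4, d_embed))  # a batch of 4 text embeddings
concepts, logits = forward(x)
print(concepts.shape, logits.shape)  # (4, 10) (4, 2)
```

Reading off `concepts` gives the per-example concept activations that make the prediction inspectable, while the residual term preserves accuracy when the basis is incomplete.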
Problem

Research questions and friction points this paper is trying to address.

Enhance interpretability of NLP classifiers
Generate concept labels without supervision
Eliminate predefined human-labeled concepts
Innovation

Methods, ideas, or system contributions that make the work stand out.

Unsupervised concept label generation
Parallel residual connection integration
Complete concept basis formation
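The "complete concept basis formation" above can be illustrated with a greedy forward-selection loop: score each candidate concept by its marginal coverage gain and add the best one until gains vanish. This is a hedged stand-in for CT-CBM's iterative targeting; the candidate pool, coverage proxy, and stopping threshold are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setup: 20 candidate concept directions in a 16-d space,
# and a target signal the concept basis should cover.
n_candidates, d = 20, 16
candidates = rng.normal(size=(n_candidates, d))
target = rng.normal(size=d)

def coverage(basis, target):
    """Fraction of the target's energy explained by the current concept basis."""
    if not basis:
        return 0.0
    B = np.stack(basis)
    # Least-squares projection of the target onto the span of the basis.
    coef, *_ = np.linalg.lstsq(B.T, target, rcond=None)
    residual = target - B.T @ coef
    return 1.0 - np.linalg.norm(residual) ** 2 / np.linalg.norm(target) ** 2

basis, remaining = [], list(range(n_candidates))
while remaining:
    # Importance score: marginal coverage gain from adding each candidate.
    gains = [(coverage(basis + [candidates[i]], target) - coverage(basis, target), i)
             for i in remaining]
    best_gain, best_i = max(gains)
    if best_gain < 1e-3:  # stop once the basis is (approximately) complete
        break
    basis.append(candidates[best_i])
    remaining.remove(best_i)

print(len(basis), round(coverage(basis, target), 3))
```

The loop mirrors the paper's high-level recipe (discover candidates, score importance, expand the bottleneck, check completeness) without claiming its actual scoring or stopping criteria.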
Milan Bhan
Sorbonne Université, CNRS, LIP6, F-75005 Paris, France; Ekimetrics, Paris, France
Yann Choho
Ekimetrics, Paris, France
Pierre Moreau
Ekimetrics, Paris, France
Jean-Noel Vittaut
Sorbonne Université, CNRS, LIP6, F-75005 Paris, France
Nicolas Chesneau
Unknown affiliation
Marie-Jeanne Lesot
LIP6, Sorbonne Université