Addressing Concept Mislabeling in Concept Bottleneck Models Through Preference Optimization

📅 2025-04-25
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Concept bottleneck models (CBMs) can suffer severe performance degradation (up to 25% in some cases) when concept labels are noisy. To address this, the paper proposes Concept Preference Optimization (CPO), a training objective that adapts direct preference optimization (DPO) to concept learning. CPO directly optimizes the posterior distribution over concepts, and an analysis of its key properties shows it is inherently less sensitive to concept label noise than the binary cross-entropy (BCE) loss typically used to train CBMs. Experiments on three real-world datasets show that CPO consistently outperforms BCE, both with and without synthetically injected concept noise.

📝 Abstract
Concept Bottleneck Models (CBMs) propose to enhance the trustworthiness of AI systems by constraining their decisions to a set of human-understandable concepts. However, CBMs typically assume that datasets contain accurate concept labels, an assumption often violated in practice, which we show can significantly degrade performance (by 25% in some cases). To address this, we introduce the Concept Preference Optimization (CPO) objective, a new loss function based on Direct Preference Optimization, which effectively mitigates the negative impact of concept mislabeling on CBM performance. We provide an analysis of some key properties of the CPO objective, showing that it directly optimizes for the concept's posterior distribution, and contrast it against Binary Cross Entropy (BCE), showing that CPO is inherently less sensitive to concept noise. We empirically confirm our analysis, finding that CPO consistently outperforms BCE on three real-world datasets, with and without added label noise.
Problem

Research questions and friction points this paper is trying to address.

Mitigating concept mislabeling in Concept Bottleneck Models
Improving CBM performance with Concept Preference Optimization
Reducing sensitivity to concept noise compared to BCE
Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces Concept Preference Optimization (CPO) objective
Optimizes concept's posterior distribution directly
Less sensitive to concept noise than BCE
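The points above can be illustrated with a minimal sketch of what a DPO-style preference objective over concept labelings might look like. This is an assumption-laden simplification, not the paper's exact formulation: the function names are hypothetical, and the reference-model terms of standard DPO are omitted here. The idea shown is ranking an observed (preferred) concept labeling above an alternative (e.g. noisy/flipped) one, rather than scoring each concept pointwise as BCE does.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def concept_log_prob(probs, labels):
    # Log-likelihood of a binary concept labeling under the model's
    # predicted Bernoulli probabilities for each concept.
    return sum(math.log(p if y else 1.0 - p) for p, y in zip(probs, labels))

def cpo_loss(probs, preferred, dispreferred, beta=1.0):
    # DPO-style pairwise loss: penalize the model for ranking the
    # dispreferred labeling above the preferred one. Only the *margin*
    # between the two labelings matters, which is one intuition for
    # reduced sensitivity to individual mislabeled concepts.
    margin = concept_log_prob(probs, preferred) - concept_log_prob(probs, dispreferred)
    return -math.log(sigmoid(beta * margin))

# A model that favors the preferred labeling incurs a lower loss
# than an uninformative one.
confident = cpo_loss([0.9, 0.1], preferred=[1, 0], dispreferred=[0, 1])
uniform = cpo_loss([0.5, 0.5], preferred=[1, 0], dispreferred=[0, 1])
```

Here `confident < uniform`: when the predicted probabilities rank the preferred labeling higher, the loss shrinks toward zero, while an uninformative model pays `log 2`.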