On The Variability of Concept Activation Vectors

📅 2025-09-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses the instability in Concept Activation Vectors (CAVs) arising from random sampling during construction. We conduct a fine-grained theoretical analysis and empirical study, providing the first rigorous proof that the estimation variance of CAVs decays at rate $1/N$ with respect to sample size $N$, thereby characterizing their statistical convergence behavior. Systematic experiments on real-world datasets empirically validate this theoretical bound and quantify the magnitude of explanation fluctuation across varying sample sizes. Our work establishes the first universal variance bound for CAVs, enabling principled guidance for sampling design. Leveraging this bound, we propose an optimal sampling strategy that jointly maximizes explanation stability and computational efficiency. The resulting approach significantly enhances the reliability and practical applicability of CAV-based model interpretations.
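
A minimal sketch of the pipeline the summary describes, assuming the usual CAV recipe (a linear classifier separating concept activations from randomly sampled counterexamples). The data, layer dimensionality, and concept signal below are synthetic placeholders, not the paper's experimental setup; the loop only illustrates how the sampling variance can be measured empirically for growing $N$:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
d = 64  # dimensionality of the layer activations (placeholder)

def sample_activations(n, concept=True):
    """Placeholder for activations of a fixed network layer.
    Concept examples are shifted along a hidden direction."""
    base = rng.normal(size=(n, d))
    if concept:
        base[:, 0] += 2.0  # synthetic concept signal
    return base

def compute_cav(n):
    """CAV = normal of a linear classifier separating concept
    activations from n randomly sampled counterexamples."""
    pos = sample_activations(n, concept=True)
    neg = sample_activations(n, concept=False)
    X = np.vstack([pos, neg])
    y = np.array([1] * n + [0] * n)
    clf = LogisticRegression(max_iter=1000).fit(X, y)
    v = clf.coef_.ravel()
    return v / np.linalg.norm(v)

# Empirical variance of the CAV direction across repeated samplings,
# for increasing sample sizes N; the claimed result predicts ~1/N decay.
for n in [50, 200, 800]:
    cavs = np.stack([compute_cav(n) for _ in range(30)])
    total_var = cavs.var(axis=0).sum()  # total variance across coordinates
    print(f"N={n:4d}  total variance = {total_var:.4f}")
```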

📝 Abstract
One of the most pressing challenges in artificial intelligence is to make models more transparent to their users. Recently, explainable artificial intelligence has come up with numerous methods to tackle this challenge. A promising avenue is to use concept-based explanations, that is, high-level concepts instead of plain feature importance scores. Among this class of methods, Concept Activation Vectors (CAVs) (Kim et al., 2018) stand out as one of the main protagonists. One interesting aspect of CAVs is that their computation requires sampling random examples from the training set. Therefore, the actual vectors obtained may vary from user to user depending on the randomness of this sampling. In this paper, we propose a fine-grained theoretical analysis of CAV construction in order to quantify their variability. Our results, confirmed by experiments on several real-life datasets, point towards a universal result: the variance of CAVs decreases as $1/N$, where $N$ is the number of random examples. Based on this, we give practical recommendations for a resource-efficient application of the method.
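
The abstract's central claim can be written schematically as follows; the norm, the expectation, and the constant $C$ are assumptions made here for illustration, not the paper's precise statement:

```latex
% \hat{v}_N: CAV estimated from N randomly sampled examples; v: its limiting direction.
% Informal restatement of the claimed 1/N convergence, with an unspecified constant C:
\mathbb{E}\!\left[ \left\| \hat{v}_N - v \right\|^{2} \right] \;\le\; \frac{C}{N}
```
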
Problem

Research questions and friction points this paper is trying to address.

Quantifying the variability in Concept Activation Vector (CAV) computation
Analyzing how random sampling affects CAV consistency
Providing resource-efficient recommendations for applying CAVs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Quantifies CAV variability through a fine-grained theoretical analysis
Establishes that CAV variance decreases as $1/N$ with the number of random examples $N$
Provides resource-efficient recommendations for applying CAVs (see the sketch after this list)
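
As referenced in the last item above, one hedged sketch of how a $1/N$ variance bound could guide sampling design: calibrate the constant with a small pilot run and extrapolate the sample size needed for a target variance. The target value and the calibration procedure are illustrative assumptions, not the paper's stated recommendation:

```python
import numpy as np

def choose_sample_size(pilot_variance, pilot_n, target_variance):
    """Assuming Var(CAV) ~ C / N, estimate C from a pilot run
    (C = pilot_variance * pilot_n) and return the smallest N
    whose predicted variance meets the target."""
    c = pilot_variance * pilot_n
    return int(np.ceil(c / target_variance))

# Example: a pilot run with N=100 measured total variance 0.08;
# to reach a variance of 0.01 we would need roughly 800 examples.
print(choose_sample_size(pilot_variance=0.08, pilot_n=100, target_variance=0.01))
```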