Towards generating more interpretable counterfactuals via concept vectors: a preliminary study on chest X-rays

📅 2025-06-04
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the weak interpretability and poor clinical consistency of counterfactual explanations in medical imaging (e.g., chest X-rays), this paper proposes an unsupervised, label-agnostic, clinically concept-driven counterfactual generation framework. Methodologically, it introduces the first human-annotation-free Concept Activation Vector (CAV) construction for chest radiographs, aligning clinical concepts with latent features via a reconstruction autoencoder; counterfactual images are then generated by traversing the latent space along these concept directions. Key contributions: (1) the first unsupervised CAV paradigm, with directions stable enough to transfer across datasets; and (2) an editable, interpretable generation mechanism guided by clinical concepts. Experiments show clinically consistent counterfactuals for large pathologies (e.g., cardiomegaly), while small lesions remain limited by reconstruction fidelity, highlighting a need for further refinement.
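The traversal step described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the encoder, decoder, shapes, and weights below are toy linear stand-ins for the trained reconstruction autoencoder, and `cav` stands in for a concept direction the paper would extract without annotations.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the paper's autoencoder: linear encode/decode
# maps with made-up dimensions (the real model is trained on chest X-rays).
D_IMG, D_LAT = 64, 8
W_enc = rng.standard_normal((D_LAT, D_IMG)) * 0.1
W_dec = rng.standard_normal((D_IMG, D_LAT)) * 0.1

def encode(x):
    return W_enc @ x

def decode(z):
    return W_dec @ z

# A concept activation vector (CAV): a unit direction in latent space
# associated with a clinical concept, e.g. "cardiomegaly".
cav = rng.standard_normal(D_LAT)
cav /= np.linalg.norm(cav)

def counterfactual(x, alpha):
    """Shift the latent code along the concept direction by step alpha:
    positive alpha exaggerates the concept, negative alpha reduces it."""
    z = encode(x)
    return decode(z + alpha * cav)

x = rng.standard_normal(D_IMG)      # stand-in for a flattened image
x_more = counterfactual(x, +3.0)    # concept exaggerated
x_less = counterfactual(x, -3.0)    # concept reduced
```

With a linear decoder the two counterfactuals differ only along the decoded concept direction; in the real nonlinear model the traversal instead produces image edits localized to the clinically relevant features.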

📝 Abstract
An essential step in deploying medical imaging models is ensuring alignment with clinical knowledge and interpretability. We focus on mapping clinical concepts into the latent space of generative models to identify Concept Activation Vectors (CAVs). Using a simple reconstruction autoencoder, we link user-defined concepts to image-level features without explicit label training. The extracted concepts are stable across datasets, enabling visual explanations that highlight clinically relevant features. By traversing latent space along concept directions, we produce counterfactuals that exaggerate or reduce specific clinical features. Preliminary results on chest X-rays show promise for large pathologies like cardiomegaly, while smaller pathologies remain challenging due to reconstruction limits. Although not outperforming baselines, this approach offers a path toward interpretable, concept-based explanations aligned with clinical knowledge.
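The paper does not detail its annotation-free CAV construction here, but one simple way to obtain a concept direction from latent codes, shown purely as an illustrative sketch with simulated stand-in data, is the normalized difference of latent means between codes that do and do not express the concept:

```python
import numpy as np

rng = np.random.default_rng(1)
D_LAT = 8

# Stand-in latent codes from the autoencoder for images with and without
# a concept (the paper derives such groupings without human labels; here
# we simply simulate a mean shift for the concept-present group).
z_with = rng.standard_normal((50, D_LAT)) + 1.0   # concept present
z_without = rng.standard_normal((50, D_LAT))      # concept absent

# A simple CAV: the unit-normalized difference of latent means.
cav = z_with.mean(axis=0) - z_without.mean(axis=0)
cav /= np.linalg.norm(cav)

# Projecting a code onto the CAV scores how strongly it expresses
# the concept; these scores can rank or edit individual images.
scores_with = z_with @ cav
scores_without = z_without @ cav
```

The resulting unit vector can then be used for the latent traversals that generate counterfactuals.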
Problem

Research questions and friction points this paper is trying to address.

Mapping clinical concepts into generative model latent space
Generating interpretable counterfactuals via concept vectors
Linking user-defined concepts to image features without label training
Innovation

Methods, ideas, or system contributions that make the work stand out.

Mapping clinical concepts into latent space
Using autoencoder for concept-image linking
Generating counterfactuals via concept vectors
🔎 Similar Papers
2024-04-16 · International Conference on Pattern Recognition · Citations: 2