🤖 AI Summary
Existing concept-based explanation methods rely on computationally expensive procedures, such as exhaustive dimension-wise traversal of a latent space, which makes it hard to discover complex, class-specific, high-level concepts efficiently. This paper proposes a framework that generates counterfactual image pairs with diffusion models and then clusters the latent-space difference vectors of those pairs, avoiding per-dimension traversal. Concept directions are extracted automatically as global, class-specific directions spanning multiple latent dimensions, using K-means or spectral clustering. The method combines diffusion-based counterfactual generation, VAE latent representations, and difference-vector modeling, and the authors present it as applicable to other data modalities. Evaluated on a dermoscopic skin lesion dataset, the learned directions are human-interpretable: they align with clinically recognized diagnostic features (e.g., “pigment network”, “blue-white structures”), expose dataset-specific biases, and in some cases point to potential new biomarkers.
📝 Abstract
Concept-based explanations have emerged as an effective approach within Explainable Artificial Intelligence, enabling interpretable insights by aligning model decisions with human-understandable concepts. However, existing methods rely on computationally intensive procedures and struggle to efficiently capture complex semantic concepts. The recently introduced Concept Discovery through Latent Diffusion-based Counterfactual Trajectories (CDCT) framework (Varshney et al., 2025) identifies concepts via dimension-wise traversal of the latent space of a Variational Autoencoder trained on counterfactual trajectories. Extending the CDCT framework, this work introduces Concept Directions via Latent Clustering (CDLC), which extracts global, class-specific concept directions by clustering latent difference vectors derived from pairs of factual and diffusion-generated counterfactual images. CDLC substantially reduces computational complexity by eliminating the exhaustive latent dimension traversal required in CDCT, and it enables the extraction of multidimensional semantic concepts encoded across the latent dimensions. The approach is validated on a real-world skin lesion dataset, demonstrating that the extracted concept directions align with clinically recognized dermoscopic features and, in some cases, reveal dataset-specific biases or unknown biomarkers. These results highlight that CDLC is interpretable, scalable, and applicable across high-stakes domains and diverse data modalities.
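The core step described above, clustering latent difference vectors of factual and counterfactual pairs to obtain concept directions, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that the latents are already available as plain lists of floats, that the difference is taken as counterfactual minus factual, that K-means is the clustering step (the abstract does not name the algorithm), and that each cluster centroid, scaled to unit length, serves as a concept direction. The function names, the farthest-first seeding, and the synthetic data in `toy_pairs` are all hypothetical choices made for illustration.

```python
import math
import random


def difference_vectors(factual, counterfactual):
    """One latent difference vector per pair.

    Sign convention (an assumption): counterfactual latent minus factual latent.
    """
    return [[c - f for f, c in zip(z_f, z_c)]
            for z_f, z_c in zip(factual, counterfactual)]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first_init(points, k):
    """Deterministic seeding: start at points[0], then repeatedly add the
    point that is farthest from every centroid chosen so far."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(
            points,
            key=lambda p: min(squared_distance(p, c) for c in centroids),
        )
        centroids.append(list(farthest))
    return centroids


def kmeans(points, k, max_iters=100):
    """Plain Lloyd iterations; stops when the assignment no longer changes."""
    centroids = farthest_first_init(points, k)
    labels = None
    for _ in range(max_iters):
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


def unit(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def concept_directions(factual, counterfactual, k):
    """Cluster the difference vectors and return unit-length centroids
    (treated here, by assumption, as the concept directions) plus labels."""
    diffs = difference_vectors(factual, counterfactual)
    centroids, labels = kmeans(diffs, k)
    return [unit(c) for c in centroids], labels


def toy_pairs(n_pairs=20, dim=4, seed=0):
    """Synthetic stand-in for encoded image pairs (not real data):
    even-indexed pairs shift along axis 0, odd-indexed pairs along axis 2."""
    rng = random.Random(seed)
    factual, counterfactual = [], []
    for i in range(n_pairs):
        z = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        z_cf = [v + rng.uniform(-0.05, 0.05) for v in z]
        z_cf[0 if i % 2 == 0 else 2] += 2.0
        factual.append(z)
        counterfactual.append(z_cf)
    return factual, counterfactual


if __name__ == "__main__":
    factual, counterfactual = toy_pairs()
    directions, labels = concept_directions(factual, counterfactual, k=2)
    for j, d in enumerate(directions):
        print(j, [round(x, 3) for x in d])
```

In this toy setting each recovered direction is dominated by a single latent axis, but nothing in the procedure requires that: a centroid can have weight on many dimensions at once, which is how clustering can capture concepts spread across the latent space without inspecting one dimension at a time. In practice a library implementation of K-means or spectral clustering would likely replace the hand-written loop.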