🤖 AI Summary
Existing counterfactual explanation methods struggle to generate diverse, semantically meaningful counterfactuals within a single class in which multiple valid decision pathways exist. Method: We propose a within-class multimodal counterfactual generation framework based on a Diffusion Autoencoder (Diffusion AE). It integrates diffusion probabilistic models with autoencoding to explicitly model multimodal latent distributions; spectral clustering identifies distinct intra-class modes in the latent space, revealing interpretable sub-paths in the decision process; gradient-guided optimization then enables mode-controllable, manifold-aware counterfactual generation. Contribution/Results: Evaluated on multiple benchmark datasets, our method significantly outperforms state-of-the-art approaches. The generated counterfactuals exhibit high plausibility, diversity, and fidelity, demonstrating improved interpretability and trustworthiness of deep learning models without compromising predictive accuracy.
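The pipeline above can be sketched in miniature. The snippet below is a hedged illustration, not the authors' implementation: it clusters (stand-in) latent codes with a minimal spectral clustering (RBF affinity, normalized Laplacian, k-means on the spectral embedding), then takes one step toward a chosen mode's centroid as a simplified, gradient-free stand-in for the paper's gradient-guided update. All function names and parameters (`spectral_cluster`, `counterfactual_step`, `sigma`, `lr`) are hypothetical.

```python
import numpy as np

def spectral_cluster(Z, k, sigma=1.0, n_iter=50):
    """Minimal spectral clustering of latent codes Z (n, d) into k intra-class
    modes: RBF affinity -> normalized Laplacian -> k-means on eigenvector rows.
    A toy stand-in for the clustering step described in the summary."""
    n = len(Z)
    # pairwise squared distances and RBF affinity
    sq = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    A = np.exp(-sq / (2.0 * sigma ** 2))
    np.fill_diagonal(A, 0.0)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(A.sum(1), 1e-12))
    L_sym = np.eye(n) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order; keep the k smallest
    _, vecs = np.linalg.eigh(L_sym)
    U = vecs[:, :k]
    U = U / np.maximum(np.linalg.norm(U, axis=1, keepdims=True), 1e-12)
    # k-means on the embedding, farthest-point initialization for determinism
    centers = np.empty((k, k))
    centers[0] = U[0]
    for j in range(1, k):
        d2 = ((U[:, None, :] - centers[None, :j, :]) ** 2).sum(-1).min(1)
        centers[j] = U[d2.argmax()]
    for _ in range(n_iter):
        labels = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(-1).argmin(1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = U[labels == j].mean(0)
    return labels

def counterfactual_step(z, Z, labels, target_mode, lr=0.5):
    """One step of latent code z toward the centroid of a chosen mode --
    a simplified, gradient-free stand-in for gradient-guided generation."""
    centroid = Z[labels == target_mode].mean(0)
    return z + lr * (centroid - z)
```

In the full method, the moved latent would be decoded back to an image by the Diffusion AE, so edits stay on the data manifold; here the centroid step only illustrates how identified modes supply controllable directions.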
📝 Abstract
Generating multiple counterfactual explanations for different modes within a class presents a significant challenge, as these modes are distinct yet converge under the same classification. Diffusion probabilistic models (DPMs) have demonstrated a strong ability to capture the underlying modes of data distributions. In this paper, we harness the power of a Diffusion Autoencoder to generate multiple distinct counterfactual explanations. By clustering in the latent space, we uncover the directions corresponding to the different modes within a class, enabling the generation of diverse and meaningful counterfactuals. We introduce a novel methodology, DifCluE, which consistently identifies these modes and produces more reliable counterfactual explanations. Our experimental results demonstrate that DifCluE outperforms the current state-of-the-art in generating multiple counterfactual explanations, offering a significant advancement in model interpretability.