🤖 AI Summary
Existing visual counterfactual explanation methods rely on per-sample local perturbations and therefore lack global interpretability across samples and attributes.
Method: This paper introduces the novel concept of “global counterfactual directions” — semantic directions in the latent space of diffusion autoencoders that encode classifier decision logic. We propose a unified framework integrating latent-space geometric analysis, contrastive learning–driven direction alignment, and semantic subspace disentanglement with orthogonalization optimization to ensure direction transferability and semantic consistency.
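The core mechanics described above can be illustrated with a minimal sketch: a shared (global) direction is applied to any sample's latent code, and candidate directions are orthogonalized to keep their semantics disentangled. This is an illustrative toy in NumPy under assumed shapes, not the paper's actual implementation; `orthogonalize` and `apply_direction` are hypothetical helper names.

```python
import numpy as np

def orthogonalize(directions):
    """Gram-Schmidt orthogonalization of candidate latent directions,
    a simple stand-in for the paper's orthogonalization optimization."""
    ortho = []
    for d in directions:
        v = d.astype(float).copy()
        for u in ortho:
            v -= (v @ u) * u  # remove the component along already-kept directions
        norm = np.linalg.norm(v)
        if norm > 1e-8:       # drop near-degenerate directions
            ortho.append(v / norm)
    return np.stack(ortho)

def apply_direction(z, d, alpha):
    """Shift a latent code z along a global direction d with strength alpha;
    the same d is reused across samples, which is what makes it 'global'."""
    return z + alpha * d

rng = np.random.default_rng(0)
dirs = orthogonalize(rng.normal(size=(3, 8)))  # 3 candidate directions in an 8-dim latent space
z = rng.normal(size=8)                         # latent code of one sample
z_cf = apply_direction(z, dirs[0], alpha=2.0)  # counterfactual latent for attribute 0
```

In the actual method, `z` would come from the diffusion autoencoder's semantic encoder and `z_cf` would be decoded back to an image; the sketch only shows the latent-space arithmetic.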
Contribution/Results: Evaluated on FFHQ and CelebA, our approach enables high-fidelity, identity-preserving, controllable attribute editing. It improves directional generalizability by 37% over baselines and achieves a 91% user-validated explanation accuracy. To our knowledge, this is the first globally grounded counterfactual modeling paradigm for generative-model-based explainable AI.