🤖 AI Summary
Existing counterfactual explanation methods suffer from vanishing gradients, discontinuous latent spaces, and an overreliance on alignment between learned and true decision boundaries, which can make their explanations unreliable. This paper proposes LeapFactual, a model-agnostic counterfactual generation framework based on conditional flow matching that produces reliable and informative counterfactuals even when the learned and true decision boundaries diverge. Because it does not require a differentiable loss function, LeapFactual also extends to human-in-the-loop systems, such as citizen science, where labels come from human annotators. Evaluated on benchmark and real-world datasets, it improves counterfactual plausibility, label fidelity, and distributional faithfulness. The generated counterfactuals are interpretable and practically useful: samples whose labels align with the ground truth can be reused as training data to improve the model, enhancing AI trustworthiness in high-stakes applications.
📝 Abstract
The growing integration of machine learning (ML) and artificial intelligence (AI) models into high-stakes domains such as healthcare and scientific research calls for models that are not only accurate but also interpretable. Among existing explainability methods, counterfactual explanations offer interpretability by identifying minimal changes to inputs that would alter a model's prediction, thus providing deeper insights. However, current counterfactual generation methods suffer from critical limitations, including vanishing gradients, discontinuous latent spaces, and an overreliance on the alignment between learned and true decision boundaries. To overcome these limitations, we propose LeapFactual, a novel counterfactual explanation algorithm based on conditional flow matching. LeapFactual generates reliable and informative counterfactuals even when the true and learned decision boundaries diverge. Being model-agnostic, LeapFactual is not limited to models with differentiable loss functions. It can even handle human-in-the-loop systems, expanding the scope of counterfactual explanations to domains that require the participation of human annotators, such as citizen science. We provide extensive experiments on benchmark and real-world datasets showing that LeapFactual generates accurate and in-distribution counterfactual explanations that offer actionable insights. We observe, for instance, that reliable counterfactual samples whose labels align with the ground truth can be beneficially used as new training data to enhance the model. The proposed method is broadly applicable and enhances both scientific knowledge discovery and non-expert interpretability.
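To make the conditional flow matching idea concrete, the sketch below shows the standard CFM training objective on toy 2-D data: pairs of base samples and target-class samples are linearly interpolated, and a conditional velocity model is regressed onto the constant path velocity. This is a generic illustration of conditional flow matching, not LeapFactual's actual implementation; `v_theta` is a hypothetical stand-in for a learned conditional velocity network.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: x1 are samples from a target class (the conditioning label),
# x0 are base samples (e.g. noise, or the input to be explained).
x1 = rng.normal(loc=[2.0, 2.0], scale=0.3, size=(64, 2))  # target-class data
x0 = rng.normal(size=(64, 2))                             # base distribution
t = rng.uniform(size=(64, 1))                             # flow times in [0, 1]

# Linear interpolation path x_t = (1 - t) x0 + t x1; its velocity along the
# path is the constant x1 - x0 (the standard CFM regression target).
xt = (1.0 - t) * x0 + t * x1
v_target = x1 - x0

# Hypothetical stand-in for a conditional velocity network v_theta(x_t, t, y):
# here just an untrained linear map over [x_t, t] features, for illustration.
W = np.zeros((3, 2))

def v_theta(xt, t):
    feats = np.concatenate([xt, t], axis=1)
    return feats @ W

# Conditional flow matching loss: mean squared error between the predicted
# and target velocities, which training would minimize over theta.
loss = np.mean(np.sum((v_theta(xt, t) - v_target) ** 2, axis=1))
```

At generation time, a counterfactual is obtained by integrating the learned ODE dx/dt = v_theta(x, t, y') from the input toward a chosen target label y', which moves the sample across the decision boundary while staying on the data distribution.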