AI Summary
To address the non-uniqueness of solutions and the difficulty of quantifying informational value in counterfactual explanations for high-dimensional black-box models, this paper proposes CPICF, a novel framework that introduces conformal prediction intervals (CPIs) into counterfactual generation for the first time. CPICF quantifies local predictive uncertainty and enables dynamic, personalized counterfactual selection by integrating user priors. Methodologically, it combines Bayesian knowledge modeling with data-augmented evaluation. Experiments on synthetic and real-world datasets demonstrate that a single CPICF-generated counterfactual significantly reduces a user's local cognitive uncertainty while improving downstream classifier generalization. The core contribution lies in unifying the statistical reliability of conformal prediction with the cognitive adaptability of counterfactual explanations, establishing a new paradigm for trustworthy, individualized model interpretability.
Abstract
Counterfactual explanations for black-box models aim to provide insight into an algorithmic decision to its recipient. For a binary classification problem, an individual counterfactual details which features might be changed for the model to infer the opposite class. The high-dimensional feature spaces typical of machine learning classification models admit many possible counterfactual examples for a given decision, so additional criteria are needed to select the most useful counterfactuals. In this paper, we explore the idea that counterfactuals should be maximally informative given a specific individual's knowledge of the underlying classifier. To quantify this information gain we explicitly model the knowledge of the individual, and assess the uncertainty of the predictions that the individual makes by the width of a conformal prediction interval. Regions of feature space where the prediction interval is wide correspond to areas where confidence in decision making is low, and where an additional counterfactual example might therefore be more informative. To explore and evaluate our individualised conformal prediction interval counterfactuals (CPICFs), we first present a synthetic data set on a hypercube which allows us to fully visualise the decision boundary, conformal intervals computed via three different methods, and the resultant CPICFs. Second, on this synthetic data set we explore the impact of a single CPICF on an individual's knowledge locally around the original query. Finally, on both the synthetic data set and a complex real-world data set with a combination of continuous and discrete variables, we measure the utility of these counterfactuals via data augmentation, testing performance on a held-out set.
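To make the selection principle concrete, here is a minimal toy sketch (our own illustrative construction, not the paper's implementation): a locally adaptive split-conformal procedure on a one-dimensional surrogate model, where the interval width at a point serves as the local uncertainty score, and the candidate counterfactual lying in the widest-interval region is preferred. The model `f`, the heteroscedastic data, and the candidate locations are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    # Hypothetical stand-in for a black-box model's predicted score.
    return np.sin(3 * x)

# Calibration data with noise that grows with x, so local uncertainty varies.
x_cal = rng.uniform(0, 2, 500)
y_cal = f(x_cal) + rng.normal(0, 0.05 + 0.4 * x_cal)

def local_scale(x0, k=50):
    # k-NN estimate of the local residual spread around x0.
    idx = np.argsort(np.abs(x_cal - x0))[:k]
    return np.std(y_cal[idx] - f(x_cal[idx])) + 1e-8

# Locally weighted split conformal: scale residuals before taking the quantile.
scales = np.array([local_scale(x) for x in x_cal])
scores = np.abs(y_cal - f(x_cal)) / scales

alpha = 0.1
n = len(scores)
q = np.quantile(scores, min(1.0, np.ceil((n + 1) * (1 - alpha)) / n))

def interval_width(x0):
    # Width of the (1 - alpha) conformal interval at x0; wide => low confidence.
    return 2 * q * local_scale(x0)

# Among candidate counterfactuals, prefer the one in the least-certain region,
# where an additional example is most informative under this heuristic.
candidates = np.array([0.2, 1.0, 1.8])
widths = np.array([interval_width(c) for c in candidates])
best = candidates[np.argmax(widths)]
```

Because the noise scale increases with x, the interval is widest near x = 1.8, so that candidate is selected; a constant-width (non-adaptive) split-conformal interval would be unable to distinguish the candidates at all.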