AI Summary
Baseline selection critically affects the reliability, fairness, and interpretability of feature attribution in explainable AI (XAI), yet existing approaches rely heavily on heuristic or domain-specific choices, introducing subjectivity and ambiguity.
Method: We propose a decision-boundary-guided automated baseline selection method that identifies baselines by sampling near the model's decision boundary, a theoretically grounded, lightweight, and semantically consistent search domain. Our approach requires no model retraining and is compatible with mainstream attribution algorithms including Integrated Gradients (IG) and Grad-CAM.
Contribution/Results: We provide the first systematic theoretical analysis revealing how baseline choice impacts attribution stability and semantic plausibility. Empirical evaluation on synthetic and real-world datasets demonstrates substantial reductions in attribution subjectivity and ambiguity, alongside improved cross-algorithm explanation consistency. An open-source implementation and reproducible guidelines are provided to advance standardization and trustworthy deployment of XAI explanations.
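The summary does not spell out the sampling procedure, so the following is only a minimal sketch of one simple way to land on a decision boundary: bisecting the segment between two inputs that the model classifies differently. The toy logistic model `predict_proba`, its weights, and the helper `boundary_point` are hypothetical stand-ins for illustration, not the authors' implementation.

```python
import math


def predict_proba(x, w=(2.0, -1.0), b=0.5):
    """Toy logistic model (hypothetical stand-in for a trained binary classifier)."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def boundary_point(x_pos, x_neg, model=predict_proba, tol=1e-6, max_iter=100):
    """Bisect the segment from x_neg to x_pos until the model output is within
    tol of 0.5, i.e. until the point sits (numerically) on the decision boundary."""
    if not (model(x_pos) > 0.5 > model(x_neg)):
        raise ValueError("endpoints must lie on opposite sides of the boundary")
    lo, hi = 0.0, 1.0  # t = 0 is x_neg, t = 1 is x_pos
    x = list(x_neg)
    for _ in range(max_iter):
        t = 0.5 * (lo + hi)
        x = [n + t * (p - n) for p, n in zip(x_pos, x_neg)]
        prob = model(x)
        if abs(prob - 0.5) < tol:
            break
        if prob > 0.5:
            hi = t  # still on the positive side: move toward x_neg
        else:
            lo = t  # still on the negative side: move toward x_pos
    return x


# One candidate baseline on the boundary between two oppositely classified inputs.
candidate = boundary_point(x_pos=[2.0, 0.0], x_neg=[-1.0, 1.0])
```

Repeating the search over many pairs of oppositely classified inputs would yield a set of boundary points from which a baseline could be chosen. How candidates are ranked is left open here, since the summary does not specify it.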
Abstract
Given the broad adoption of artificial intelligence, it is essential to provide evidence that AI models are reliable, trustworthy, and fair. To this end, the emerging field of eXplainable AI develops techniques to probe such requirements, counterbalancing the hype that drives the pervasiveness of this technology. Among the many facets of this issue, this paper focuses on baseline attribution methods, which derive a feature attribution map at the network input by relying on a "neutral" stimulus usually called the "baseline". The choice of the baseline is crucial, as it determines the explanation of the network behavior. In this framework, this paper has a twofold goal: shedding light on the implications of the choice of the baseline, and providing a simple yet effective method for identifying the best baseline for the task. To achieve this, we propose a decision boundary sampling method: since the baseline, by definition, lies on the decision boundary, the boundary naturally becomes the search domain. Experiments are performed on synthetic examples and validated using state-of-the-art methods. Although limited in experimental scope, this contribution is relevant as it offers clear guidelines and a simple proxy for baseline selection, reducing ambiguity and enhancing the reliability of, and trust in, deep models.
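To make concrete why the baseline determines the explanation, here is a minimal sketch of Integrated Gradients, one widely used baseline attribution method, on a toy logistic model. The weights `W`, bias `B`, the input `x`, and both baselines are hypothetical values chosen for illustration. The path integral is approximated with a midpoint Riemann sum.

```python
import math

W, B = (2.0, -1.0), 0.5  # hypothetical toy model parameters


def f(x):
    """Toy logistic model output in (0, 1)."""
    z = sum(w * xi for w, xi in zip(W, x)) + B
    return 1.0 / (1.0 + math.exp(-z))


def grad_f(x):
    """Analytic gradient of the logistic output with respect to the input."""
    p = f(x)
    return [p * (1.0 - p) * w for w in W]


def integrated_gradients(x, baseline, steps=200):
    """Approximate IG along the straight path from baseline to x (midpoint rule)."""
    total = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i, g in enumerate(grad_f(point)):
            total[i] += g
    # Scale the averaged gradient by the input-minus-baseline difference.
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, total)]


x = [1.0, 1.0]
attr_zero = integrated_gradients(x, baseline=[0.0, 0.0])      # all-zeros baseline
attr_boundary = integrated_gradients(x, baseline=[0.0, 0.5])  # f(baseline) = 0.5
```

In both cases the attributions sum to approximately `f(x) - f(baseline)`, which is the completeness property of Integrated Gradients, so the explained quantity itself changes with the baseline. With the all-zeros baseline the reference output is not 0.5, whereas the second baseline sits on this toy model's decision boundary, so its attributions account for the distance of `f(x)` from the undecided output.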