🤖 AI Summary
This study addresses the model dependence of SHAP value interpretation: because the meaning of a SHAP value depends on the underlying model, no universal analysis procedure exists, which limits how reliably black-box model decisions can be explained in high-stakes applications. The work examines SHAP analysis in detail across several machine learning models and data sets, showing how the interpretation changes from model to model. It also proposes a generalization of the waterfall plot to multi-class classification problems. The results are intended as practical guidance for analysts working in explainable artificial intelligence.
📝 Abstract
In this growing age of data and technology, large black-box models are becoming the norm due to their ability to handle vast amounts of data and learn incredibly complex data patterns. The deficiency of these models, however, is their inability to explain the prediction process, which makes them untrustworthy and their use precarious in high-stakes situations. SHapley Additive exPlanations (SHAP) analysis is an explainable AI method growing in popularity for its ability to explain model predictions in terms of the original features. For each sample and feature in the data set, an associated SHAP value quantifies the contribution of that feature to the prediction for that sample. Analysis of these SHAP values provides valuable insight into the model's decision-making process, which can be leveraged to create data-driven solutions. The interpretation of these SHAP values, however, is model-dependent, so no universal analysis procedure exists. To aid in these efforts, we present a detailed investigation of SHAP analysis across various machine learning models and data sets. In uncovering the details and nuances behind SHAP analysis, we hope to empower analysts in this less-explored territory. We also present a novel generalization of the waterfall plot to the multi-class classification problem.
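To make the per-sample, per-feature contribution concrete, the following is a minimal sketch of exact Shapley values for a single sample, computed by enumerating every feature coalition. It is not the paper's method or the `shap` library's implementation: `toy_model`, the all-zero `baseline`, and the choice to replace "missing" features with baseline values are all assumptions made for illustration. The loop at the end accumulates the contributions from the base value up to the prediction, which is the quantity a single-output waterfall plot displays.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample by enumerating all coalitions.

    Features outside a coalition are replaced by the baseline value.
    This masking rule is an illustrative assumption; practical SHAP
    explainers handle absent features in model-specific ways.
    """
    n = len(x)

    def value(coalition):
        masked = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(masked)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(features):
    """Hypothetical model: one additive term and one interaction term."""
    a, b, c = features
    return 2.0 * a + b * c


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]

phi = shapley_values(toy_model, x, baseline)
base_value = toy_model(baseline)
prediction = toy_model(x)

# Waterfall-style accumulation: start at the base value and add one
# feature contribution at a time until the prediction is reached.
running = base_value
for name, contribution in zip(["a", "b", "c"], phi):
    running += contribution
    print(f"{name}: contribution={contribution:+.3f}, running total={running:.3f}")
```

In this toy case the additive feature `a` receives its full effect, while the interaction `b * c` is split evenly between `b` and `c`. The contributions sum to the difference between the prediction and the base value, and that additivity is what lets a waterfall plot stack them. For a multi-class model there would be one such vector of contributions per class output.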