🤖 AI Summary
The black-box nature of deep learning models in predictive process monitoring (PPM) undermines user trust, and existing XAI evaluation frameworks emphasize technical metrics (e.g., fidelity) while neglecting the impact of explanations on human decision-making.
Method: We conducted a pre-post human factors experiment—the first systematic investigation of its kind in PPM—to isolate and assess the interaction effects between three XAI explanation styles (feature importance, rule-based, and counterfactual) and users’ perceived AI accuracy (high vs. low). Explanations were generated using SHAP, RULES, and DiCE, respectively, and evaluated on both objective metrics (task performance, human-AI agreement) and subjective metrics (decision confidence).
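To illustrate the counterfactual explanation style, the sketch below searches for a minimal feature change that flips a prediction. The toy outcome model, feature names, and search procedure are illustrative assumptions only; they are not the models or the DiCE pipeline used in the study.

```python
# Illustrative counterfactual explanation for a process-outcome predictor.
# The linear model and features below are hypothetical, chosen only to
# demonstrate the idea of "smallest change that flips the prediction".

def predict(case):
    """Toy model: True means the case is predicted to be delayed."""
    score = (0.8 * case["open_activities"]
             + 0.5 * case["prior_delays"]
             - 0.3 * case["resources"])
    return score > 1.0

def counterfactual(case, feature, step, max_steps=20):
    """Perturb one feature until the prediction flips; return the change."""
    cf = dict(case)
    for _ in range(max_steps):
        if predict(cf) != predict(case):
            return feature, cf[feature]
        cf[feature] += step
    return None  # no flip found within max_steps

case = {"open_activities": 2, "prior_delays": 1, "resources": 1}
print(predict(case))                                   # True: predicted delayed
print(counterfactual(case, "resources", step=1))       # ('resources', 4)
```

A user reading this counterfactual learns an actionable condition ("with 4 resources, the delay prediction flips") rather than a raw attribution score, which is the contrast the experiment examines.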
Contribution/Results: High perceived accuracy significantly increased decision confidence; counterfactual explanations yielded the greatest improvements in task performance and consistency; and a statistically significant interaction emerged between explanation style and perceived accuracy. This study bridges a critical gap by providing the first empirical, user-centered evaluation of XAI effectiveness in PPM.
📝 Abstract
Predictive Process Monitoring (PPM) often uses deep learning models to predict the future behavior of ongoing processes, such as process outcomes. While these models achieve high accuracy, their lack of interpretability undermines user trust and adoption. Explainable AI (XAI) aims to address this challenge by providing the reasoning behind predictions. However, current evaluations of XAI in PPM focus primarily on functional metrics (such as fidelity), overlooking user-centered aspects such as the effect of explanations on task performance and decision-making. This study investigates the effects of explanation style (feature importance, rule-based, and counterfactual) and perceived AI accuracy (low or high) on decision-making in PPM. We conducted a decision-making experiment in which users were presented with AI predictions, perceived accuracy levels, and explanations in different styles. Users' decisions were measured both before and after receiving explanations, allowing the assessment of objective metrics (task performance and agreement) and subjective metrics (decision confidence). Our findings show that perceived accuracy and explanation style have a significant effect on decision-making.