🤖 AI Summary
This work addresses the challenge of conflicting post-hoc explanations—such as those generated by LIME, SHAP, and BreakDown—in software defect prediction, which can increase developers’ cognitive load and erode trust in model predictions. To mitigate this issue, the authors propose a developer-oriented explainable AI aggregation method that combines human-centered design with rank-aware aggregation mechanisms. By leveraging adaptive thresholds, sign and ranking consistency analysis, and a fallback strategy, the approach synthesizes multiple explanation sources into a unified, coherent view. The aggregated explanations are embedded in a VS Code plugin to support real-time defect inspection. User studies show that nearly 90% of developers prefer this aggregated form of explanation, reporting substantially reduced confusion and greater effectiveness in everyday debugging tasks.
📝 Abstract
Machine learning (ML)-based defect prediction models can improve software quality. However, their opaque reasoning creates an HCI challenge because developers struggle to trust models they cannot interpret. Explainable AI (XAI) methods such as LIME, SHAP, and BreakDown aim to provide transparency, but when used together, they often produce conflicting explanations that increase confusion, frustration, and cognitive load. To address this usability challenge, we introduce XMENTOR, a human-centered, rank-aware aggregation method implemented as a VS Code plugin. XMENTOR unifies multiple post-hoc explanations into a single, coherent view by applying adaptive thresholding, rank and sign agreement, and fallback strategies to preserve clarity without overwhelming users. In a user study, nearly 90% of the participants preferred aggregated explanations, citing reduced confusion and stronger support for everyday debugging and defect-review tasks. Our findings show how combining explanations and embedding them into developer workflows can enhance interpretability, usability, and trust.
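To make the aggregation idea concrete, the steps named in the abstract (sign agreement, rank consistency, adaptive thresholding, and a fallback strategy) can be sketched roughly as below. This is an illustrative reconstruction under stated assumptions, not XMENTOR's actual implementation: the function names, the mean-magnitude threshold, and the choice of SHAP as the fallback explainer are all assumptions for the sake of the example.

```python
from statistics import mean

def rank_map(scores):
    """Rank features by absolute attribution within one explainer (1 = most important)."""
    ordered = sorted(scores, key=lambda f: abs(scores[f]), reverse=True)
    return {f: i + 1 for i, f in enumerate(ordered)}

def aggregate(explanations, fallback="SHAP"):
    """Merge per-feature attributions from several explainers into one ranked view.

    explanations: {explainer_name: {feature: signed_score}}, e.g. LIME/SHAP/BreakDown
    outputs. Returns [(feature, mean_score), ...], most important first.
    Illustrative sketch only; threshold and fallback choices are assumptions.
    """
    ranks = {name: rank_map(scores) for name, scores in explanations.items()}
    features = set().union(*(s.keys() for s in explanations.values()))

    agreed = {}  # feature -> (mean score, mean rank), kept only if signs agree
    for f in features:
        vals = [s[f] for s in explanations.values() if f in s]
        same_sign = all(v > 0 for v in vals) or all(v < 0 for v in vals)
        if same_sign:
            avg_rank = mean(r[f] for n, r in ranks.items() if f in explanations[n])
            agreed[f] = (mean(vals), avg_rank)

    if not agreed:
        # Fallback strategy: no consensus at all, trust one designated explainer.
        scores = explanations[fallback]
        return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)

    # Adaptive threshold: keep features at least as strong as the mean magnitude
    # of the sign-consistent features, then order by average rank across explainers.
    thr = mean(abs(s) for s, _ in agreed.values())
    kept = [(f, s) for f, (s, _) in agreed.items() if abs(s) >= thr]
    kept.sort(key=lambda kv: agreed[kv[0]][1])
    return kept
```

For example, a feature that LIME scores positively but SHAP scores negatively would be dropped by the sign-agreement step rather than shown to the developer with a contradictory direction, which is the kind of conflict the abstract says increases cognitive load.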