🤖 AI Summary
Problem: Machine learning models in software engineering (SE) are insufficiently interpretable, which undermines decision transparency in critical tasks such as vulnerability detection.
Method: We conduct a systematic literature review (SLR) analyzing 108 peer-reviewed papers spanning 23 SE tasks.
Contribution/Results: This study establishes the first comprehensive XAI (Explainable AI) landscape for SE, identifying six high-value application scenarios and seven mainstream technical approaches. It reveals critical gaps in evaluation methodologies—especially the absence of structured, empirically grounded benchmarks for SE-XAI. Furthermore, we synthesize a prioritized list of industrial deployment challenges and actionable best practices. By bridging the gap between theoretical XAI research and SE practice, this work provides a foundational, evidence-based reference for both academic investigation and real-world engineering adoption of interpretable ML in software development and assurance.
📝 Abstract
The remarkable achievements of Artificial Intelligence (AI) algorithms, particularly in Machine Learning (ML) and Deep Learning (DL), have fueled their extensive deployment across multiple sectors, including Software Engineering (SE). However, due to their black-box nature, these promising AI-driven SE models are still far from being deployed in practice. This lack of explainability poses risks for their application in critical tasks, such as vulnerability detection, where decision-making transparency is of paramount importance. This paper clarifies this interdisciplinary domain by presenting a systematic literature review of approaches that aim to improve the explainability of AI models within the context of SE. The review covers work appearing in the most prominent SE and AI conferences and journals, and spans 108 papers across 23 unique SE tasks. Guided by three key Research Questions (RQs), we aim to (1) summarize the SE tasks where explainable AI (XAI) techniques have shown success to date; (2) classify and analyze different XAI techniques; and (3) investigate existing evaluation approaches. Based on our findings, we identify a set of challenges that remain to be addressed in existing studies, together with a set of guidelines highlighting opportunities we deem appropriate and important for future work.