ProvX: Generating Counterfactual-Driven Attack Explanations for Provenance-Based Detection

📅 2025-08-08

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Graph neural network (GNN)-based intrusion detection models that operate on provenance graphs are hard to interpret, which limits their usefulness for security analysis and decision-making. Method: This paper proposes ProvX, a framework that formulates critical-subgraph identification as a continuous optimization problem. It uses a dual-objective loss that jointly optimizes for flipping the prediction and minimizing structural distance from the original graph, coupled with a Staged Solidification strategy that improves the precision and stability of the explanations. ProvX generates counterfactual-driven attack explanations that localize the minimal subgraph structures responsible for a malicious classification, and it supports a closed loop of detection, explanation, and model feedback. Results: Experiments on the evaluation datasets show that ProvX achieves an average explanation necessity of 51.59%, outperforming existing state-of-the-art explainers. Its explanations can also guide model refinement and improve robustness against adversarial attacks.
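The summary reports "necessity" without defining it. As a minimal sketch, assuming the metric follows the usual convention for counterfactual explainers (the share of malicious-predicted graphs whose prediction flips once the explanation edges are removed), it could be computed as below. The function name `necessity`, the edge-set graph representation, and the toy predictor are all hypothetical and are not taken from the paper.

```python
from typing import Callable, FrozenSet, Iterable, Tuple

Edge = Tuple[str, str]          # (source entity, target entity), illustrative only
Graph = FrozenSet[Edge]

def necessity(cases: Iterable[Tuple[Graph, Graph]],
              predict: Callable[[Graph], int]) -> float:
    """Fraction of malicious-predicted graphs that stop being predicted
    malicious (label 1) after their explanation edges are removed."""
    flipped = 0
    total = 0
    for graph, explanation in cases:
        if predict(graph) != 1:
            continue                      # only malicious predictions are explained
        total += 1
        if predict(graph - explanation) != 1:
            flipped += 1                  # removing the explanation flipped the label
    return flipped / total if total else 0.0

# Toy predictor (assumption): a graph is malicious iff bash reads /etc/shadow.
def toy_predict(graph: Graph) -> int:
    return 1 if ("bash", "/etc/shadow") in graph else 0

g = frozenset({("sshd", "bash"), ("bash", "/etc/shadow")})
cases = [
    (g, frozenset({("bash", "/etc/shadow")})),  # explanation hits the decisive edge
    (g, frozenset({("sshd", "bash")})),         # explanation misses it
]
score = necessity(cases, toy_predict)
print(score)
```

Under this reading, a necessity of 51.59% would mean that removing the explanation subgraph flips the prediction for roughly half of the explained graphs.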

📝 Abstract

Provenance graph-based intrusion detection systems are deployed on hosts to defend against increasingly severe Advanced Persistent Threats (APTs). Using Graph Neural Networks (GNNs) to detect these threats has become a research focus and has demonstrated exceptional performance. However, the widespread adoption of GNN-based security models is limited by their inherent black-box nature: they give security analysts no verifiable explanation for a prediction and no evidence linking the model's judgment to real-world attacks. To address this challenge, we propose ProvX, an effective framework for explaining GNN-based security models on provenance graphs. ProvX introduces counterfactual explanation logic, seeking the minimal structural subset within a graph predicted as malicious that, when perturbed, can subvert the model's original prediction. We transform the discrete search problem of finding this critical subgraph into a continuous optimization task guided by a dual objective of prediction flipping and distance minimization. Furthermore, a Staged Solidification strategy is incorporated to enhance the precision and stability of the explanations. We conducted extensive evaluations of ProvX on authoritative datasets. The experimental results demonstrate that ProvX can locate critical graph structures that are highly relevant to real-world attacks and achieves an average explanation necessity of 51.59%, outperforming current state-of-the-art (SOTA) explainers on these metrics. Finally, we explore and provide a preliminary validation of a closed-loop Detection-Explanation-Feedback enhancement framework, demonstrating through experiments that the explanation results from ProvX can guide model optimization and effectively enhance its robustness against adversarial attacks.
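The relaxation from a discrete search to a continuous objective can be written out schematically. The abstract gives no formulas, so the notation below is an assumption chosen for illustration: G is a graph with edge set E that the model f predicts as malicious, S is the set of perturbed (here: removed) edges, m is a soft edge mask parameterized by θ, and λ weights the distance term. The paper's actual loss terms and perturbation types may differ.

```latex
% Discrete counterfactual search (assumed form): smallest edge set whose removal flips f
S^{\star} \;=\; \arg\min_{S \subseteq E} \; |S|
\quad \text{s.t.} \quad f(G \setminus S) \neq f(G)

% Continuous relaxation (assumed form): soft mask m = \sigma(\theta) \in [0,1]^{|E|}
\min_{\theta} \;
\underbrace{\mathcal{L}_{\mathrm{flip}}\big(f(G \odot m)\big)}_{\text{prediction flipping}}
\;+\;
\lambda \, \underbrace{\lVert \mathbf{1} - m \rVert_{1}}_{\text{distance minimization}},
\qquad m = \sigma(\theta)
```

In this sketch the first term is small once the masked graph is no longer classified as malicious, and the second term penalizes every edge that is switched off, so the two together favor the smallest perturbation that still flips the prediction.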

Problem

Research questions and friction points this paper is trying to address.

Explaining black-box GNN-based security models on provenance graphs

Finding minimal structural subsets to flip malicious predictions

Enhancing model robustness through explanation-guided optimization

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses counterfactual explanations for GNN-based security models

Transforms subgraph search into a continuous optimization task

Implements Staged Solidification for precise, stable explanations
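The three ideas above can be illustrated with a small, self-contained sketch. Everything in it is an assumption made for illustration: the "model" is a toy weighted sum over edges rather than a GNN, the flip term is a hinge loss, and the solidification step (freezing mask entries that are confidently near 0 or 1 at the end of each stage) is only one plausible reading of "Staged Solidification", not the procedure described in the paper. The names `toy_logit` and `explain` and all hyperparameters are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def toy_logit(mask, weights, bias):
    """Hypothetical stand-in for a GNN: malicious logit = weighted sum of kept edges."""
    return sum(w * m for w, m in zip(weights, mask)) + bias

def explain(weights, bias, lam=0.1, margin=0.2, lr=0.5,
            stages=3, steps_per_stage=100, lo=0.1, hi=0.9):
    """Return indices of edges whose removal flips the toy model to benign."""
    n = len(weights)
    theta = [3.0] * n        # mask logits; sigmoid(3) ~ 0.95, so we start near the original graph
    frozen = [None] * n      # None = still optimized, otherwise fixed at 0.0 or 1.0

    def current_mask():
        return [sigmoid(t) if f is None else f for t, f in zip(theta, frozen)]

    for _ in range(stages):
        for _ in range(steps_per_stage):
            mask = current_mask()
            # Hinge flip loss max(0, logit + margin) is active while still malicious.
            active = toy_logit(mask, weights, bias) + margin > 0
            for i in range(n):
                if frozen[i] is not None:
                    continue
                # Gradient w.r.t. m_i of: flip loss + lam * sum(1 - m)
                grad_m = (weights[i] if active else 0.0) - lam
                # Chain rule through m_i = sigmoid(theta_i)
                theta[i] -= lr * grad_m * mask[i] * (1.0 - mask[i])
        # Assumed solidification step: freeze confidently decided edges.
        for i, m in enumerate(current_mask()):
            if frozen[i] is None:
                if m <= lo:
                    frozen[i] = 0.0
                elif m >= hi:
                    frozen[i] = 1.0

    return [i for i, m in enumerate(current_mask()) if m < 0.5]

# Toy graph: edge 0 carries most of the malicious evidence.
weights = [4.0, 0.5, 0.3, -0.2]
bias = -1.0
removed = explain(weights, bias)
kept = [0.0 if i in removed else 1.0 for i in range(len(weights))]
print("removed edges:", removed)
print("flipped to benign:", toy_logit(kept, weights, bias) < 0)
```

The hinge loss is a deliberate choice in this sketch: once the prediction has flipped, only the distance term remains active, so edges that were not needed drift back toward the original graph instead of being removed along with the decisive one.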

🔎 Similar Papers

Explainable Artificial Intelligence (XAI) for Malware Analysis: A Survey of Techniques, Applications, and Open Challenges