On the Effect of Ruleset Tuning and Data Imbalance on Explainable Network Security Alert Classifications: a Case-Study on DeepCASE

📅 2025-07-02

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper examines label imbalance in network intrusion alert classification within automated Security Operations Centers (SOCs), using the state-of-the-art method DeepCASE as a case study. The authors show that imbalance affects both classification performance and the correctness of the explanations DeepCASE offers. They conclude that tuning the detection rules already used in SOCs can significantly reduce imbalance and may improve the performance and explainability of alert post-processing methods such as DeepCASE. The findings suggest that traditional methods for improving input data quality can benefit automation.

📝 Abstract

Automation in Security Operations Centers (SOCs) plays a prominent role in alert classification and incident escalation. However, automated methods must be robust in the presence of imbalanced input data, which can negatively affect performance. Additionally, automated methods should make explainable decisions. In this work, we evaluate the effect of label imbalance on the classification of network intrusion alerts. As our use case, we employ DeepCASE, the state-of-the-art method for automated alert classification. We show that label imbalance impacts both the classification performance and the correctness of the classification explanations offered by DeepCASE. We conclude that tuning the detection rules used in SOCs can significantly reduce imbalance and may benefit the performance and explainability offered by alert post-processing methods such as DeepCASE. Our findings therefore suggest that traditional methods to improve the quality of input data can benefit automation.
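
To make the imbalance concern concrete, here is a minimal sketch of why overall accuracy can hide poor performance on a rare class. It does not use DeepCASE or any data from the paper: the labels, the 95/5 split, the always-majority classifier, and the helper `per_class_recall` are all assumptions made for illustration.

```python
from collections import Counter

def per_class_recall(y_true, y_pred):
    """Return {label: recall} for every label present in y_true."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}

# Hypothetical, heavily imbalanced alert labels (not data from the paper).
y_true = ["benign"] * 95 + ["attack"] * 5
# A trivial classifier that always predicts the majority label.
y_pred = ["benign"] * 100

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
recall = per_class_recall(y_true, y_pred)
macro_recall = sum(recall.values()) / len(recall)

print(f"accuracy={accuracy:.2f}")
print(f"per-class recall={recall}")
print(f"macro recall={macro_recall:.2f}")
```

In this toy setting the classifier is right 95% of the time yet never identifies an attack, which is why per-class or macro-averaged metrics are usually preferred when labels are imbalanced.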

Problem

Research questions and friction points this paper is trying to address.

Evaluating label imbalance impact on network intrusion alert classification

Assessing DeepCASE performance and explanation correctness under imbalance

Exploring ruleset tuning to reduce imbalance and enhance automation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Tuning rulesets to reduce data imbalance

Using DeepCASE for automated alert classification

Enhancing explainability in network security alerts
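
The idea that ruleset tuning can reduce imbalance can be illustrated with a small sketch. Everything here is hypothetical: the rule names, the alert counts, the helper `imbalance_ratio`, and the choice of suppressing a rule outright as the tuning step. The paper's actual tuning procedure is not described on this page, so this only shows the general mechanism.

```python
from collections import Counter

def imbalance_ratio(labels):
    """Ratio of the most frequent label count to the least frequent one."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())

# Hypothetical alerts as (rule_id, label) pairs; not data from the paper.
alerts = (
    [("R1-noisy-scan", "benign")] * 900
    + [("R2-login", "benign")] * 80
    + [("R2-login", "attack")] * 20
)

before = imbalance_ratio(label for _, label in alerts)

# Assumed tuning step: suppress the rule that fires almost only on benign traffic.
tuned = [(rule, label) for rule, label in alerts if rule != "R1-noisy-scan"]
after = imbalance_ratio(label for _, label in tuned)

print(f"imbalance ratio before tuning: {before:.1f}")
print(f"imbalance ratio after tuning:  {after:.1f}")
```

In practice, tuning may mean adjusting thresholds or refining rule conditions rather than removing a rule, and any suppression has to be weighed against the risk of losing true detections.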

🔎 Similar Papers

Explainable Artificial Intelligence (XAI) for Malware Analysis: A Survey of Techniques, Applications, and Open Challenges