🤖 AI Summary
Existing DoS defense mechanisms struggle against AI-driven adaptive attacks. This paper proposes AdaDoS, the first deep adversarial reinforcement learning framework for generating adaptive DoS attacks that evade both machine learning–based and rule-based detectors. Methodologically, AdaDoS formulates the attack–detection interaction as a partially observable Markov decision process (POMDP), capturing the dynamic adversarial game between attacker and detector. It introduces a teacher–student co-learning mechanism to enable policy distillation and efficient exploration under partial observability. Crucially, it is the first work to apply reinforcement learning for synthesizing DoS attack sequences capable of evading multiple heterogeneous detectors. Extensive experiments across diverse SDN environments demonstrate that AdaDoS significantly enhances attack stealthiness and adaptivity, successfully bypassing state-of-the-art detection systems while maintaining high evasion rates under real-time constraints.
📝 Abstract
Existing defence mechanisms have demonstrated significant effectiveness in mitigating rule-based Denial-of-Service (DoS) attacks, leveraging predefined signatures and static heuristics to identify and block malicious traffic. However, the emergence of AI-driven techniques presents new challenges to software-defined networking (SDN) security, potentially compromising the efficacy of existing defence mechanisms. In this paper, we introduce AdaDoS, an adaptive attack model that disrupts network operations while evading detection by existing DoS detectors through adversarial reinforcement learning (RL). Specifically, AdaDoS models the problem as a competitive game between an attacker, whose goal is to obstruct network traffic without being detected, and a detector, which aims to identify malicious traffic. AdaDoS solves this game by dynamically adjusting its attack strategy based on feedback from the SDN and the detector. Additionally, recognising that attackers typically have less information than defenders, AdaDoS formulates the DoS-like attack as a partially observable Markov decision process (POMDP), with the attacker having access only to delay information between attacker and victim nodes. We address this challenge with a novel reciprocal learning module, in which the student agent, with limited observations, enhances its performance by learning from the teacher agent, which has full observational capabilities in the SDN environment. AdaDoS represents the first application of RL to develop DoS-like attack sequences capable of adaptively evading both machine learning-based and rule-based DoS-like attack detectors.
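The partial-observability setup described above can be illustrated with a small, purely conceptual sketch. It is not the authors' implementation: the state fields other than delay (`link_load`, `detector_score`), the three-action policy, the logit values, and the use of a KL-divergence distillation term are all assumptions made for illustration. The sketch only shows how a teacher that sees the full state and a student that sees delay alone could be compared through their action distributions.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FullState:
    """Toy environment state; every field except delay_ms is a hypothetical example."""
    delay_ms: float        # attacker-victim delay (the only signal the student sees)
    link_load: float       # assumed hidden variable, visible to the teacher only
    detector_score: float  # assumed hidden variable, visible to the teacher only


def teacher_observation(state: FullState) -> tuple:
    """The teacher has full observability of the (toy) environment."""
    return (state.delay_ms, state.link_load, state.detector_score)


def student_observation(state: FullState) -> tuple:
    """The student observes delay information only, as in the POMDP formulation."""
    return (state.delay_ms,)


def softmax(logits: list) -> list:
    """Numerically stable softmax over a list of action logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p: list, q: list) -> float:
    """KL(p || q): an assumed distillation loss pulling the student toward the teacher."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


state = FullState(delay_ms=12.5, link_load=0.8, detector_score=0.3)

# Hypothetical policy outputs over three abstract actions.
teacher_probs = softmax([2.0, 0.5, -1.0])  # informed by the full state
student_probs = softmax([0.0, 0.0, 0.0])   # untrained student: uniform policy

distillation_loss = kl_divergence(teacher_probs, student_probs)
```

In a full training loop, a term like `distillation_loss` would typically be combined with the student's own RL objective. The abstract describes the learning as reciprocal, so the actual method may also pass information from student to teacher, which this one-directional sketch does not model.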