🤖 AI Summary
Current AutoML applications in cybersecurity lack systematic reliability evaluation: tool performance is unstable, no universally optimal framework exists, and tools over-rely on tree-based models that are prone to overfitting and offer poor interpretability, while further challenges include adversarial vulnerability, model drift, and inadequate feature engineering. Method: This study conducts the first unified benchmark evaluation of eight open-source AutoML frameworks across eleven cybersecurity datasets (covering intrusion detection, malware classification, and phishing identification) and proposes a paradigm shift "from model selection to AutoML framework selection." Results: No single framework dominates across all metrics; significant trade-offs exist among performance, efficiency, and automation capability; and the bias toward tree models exacerbates interpretability and robustness bottlenecks. We derive actionable guidelines for trustworthy AutoML deployment and identify key future directions: enhancing adversarial robustness, dynamic adaptability, and interpretable automation.
📝 Abstract
Automated machine learning (AutoML) has emerged as a promising paradigm for automating machine learning (ML) pipeline design, broadening AI adoption. Yet its reliability in complex domains such as cybersecurity remains underexplored. This paper systematically evaluates eight open-source AutoML frameworks across 11 publicly available cybersecurity datasets, spanning intrusion detection, malware classification, phishing detection, fraud detection, and spam filtering. Results show substantial performance variability across tools and datasets, with no single solution consistently superior. A paradigm shift is observed: the challenge has moved from selecting individual ML models to identifying the most suitable AutoML framework, a choice complicated by differences in runtime efficiency, automation capabilities, and supported features. AutoML tools frequently favor tree-based models, which perform well but risk overfitting and limit interpretability. Key challenges identified include adversarial vulnerability, model drift, and inadequate feature engineering. We conclude with best practices and research directions to strengthen robustness, interpretability, and trust in AutoML for high-stakes cybersecurity applications.