STAF: Leveraging LLMs for Automated Attack Tree-Based Security Test Generation

📅 2025-09-24
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Low efficiency, high error rates, and insufficient automation plague the conversion of attack trees into test cases in automotive security testing. Method: this paper proposes the first LLM-driven security test case generation framework tailored to in-vehicle systems. It employs a four-step self-correcting retrieval-augmented generation (RAG) pipeline (attack tree parsing, semantic retrieval, multi-round logical validation, and executable test script generation) to achieve end-to-end automation. A domain-knowledge-guided self-correction strategy ensures generated test cases are semantically sound, safety-compliant, and executable, while tightly integrating attack tree analysis with automated test verification. Contribution/Results: experiments demonstrate substantial improvements: 12× higher generation efficiency than manual methods, 94.3% accuracy, and strong cross-model compatibility (supporting GPT-4.1, DeepSeek, and others). The framework has been validated on real-world automotive ECUs.
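The four-step pipeline described above can be sketched in outline form. This is a minimal illustrative skeleton, not the paper's actual implementation: all names, the keyword-overlap retrieval (standing in for real semantic/embedding retrieval), and the toy validity check are assumptions made for clarity.

```python
# Hypothetical sketch of a four-step self-correcting RAG pipeline for
# attack-tree-based test generation. Interfaces are illustrative, not STAF's API.
from dataclasses import dataclass, field

@dataclass
class AttackTreeNode:
    goal: str
    children: list = field(default_factory=list)

def parse_attack_tree(root: AttackTreeNode) -> list:
    """Step 1: flatten the attack tree into leaf-level attack steps."""
    if not root.children:
        return [root.goal]
    steps = []
    for child in root.children:
        steps.extend(parse_attack_tree(child))
    return steps

def retrieve_context(step: str, knowledge_base: dict) -> str:
    """Step 2: retrieve domain knowledge for a step.
    Keyword overlap stands in for real embedding-based semantic retrieval."""
    best = max(knowledge_base,
               key=lambda k: len(set(k.split()) & set(step.split())))
    return knowledge_base[best]

def generate_test(step: str, context: str, llm) -> str:
    """Step 4: prompt the LLM for an executable test script."""
    return llm(f"Write an executable test for: {step}\nContext: {context}")

def validate_and_correct(script: str, llm, max_rounds: int = 3) -> str:
    """Step 3: multi-round logical validation with self-correction.
    A real system would check semantics, safety compliance, and executability;
    here a single toy check (presence of an assertion) stands in."""
    for _ in range(max_rounds):
        if "assert" in script:
            return script
        script = llm(f"Fix this test so it contains an assertion:\n{script}")
    return script

def staf_pipeline(tree: AttackTreeNode, knowledge_base: dict, llm) -> list:
    """End-to-end: attack tree in, validated test scripts out."""
    tests = []
    for step in parse_attack_tree(tree):
        context = retrieve_context(step, knowledge_base)
        draft = generate_test(step, context, llm)
        tests.append(validate_and_correct(draft, llm))
    return tests
```

In practice the `llm` parameter would wrap a model such as GPT-4.1 or DeepSeek, and the validation loop would feed concrete error diagnostics back into the correction prompt rather than a fixed instruction.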

📝 Abstract
In modern automotive development, security testing is critical for safeguarding systems against increasingly advanced threats. Attack trees are widely used to systematically represent potential attack vectors, but generating comprehensive test cases from these trees remains a labor-intensive, error-prone task that has seen limited automation in the context of testing vehicular systems. This paper introduces STAF (Security Test Automation Framework), a novel approach to automating security test case generation. Leveraging Large Language Models (LLMs) and a four-step self-corrective Retrieval-Augmented Generation (RAG) framework, STAF automates the generation of executable security test cases from attack trees, providing an end-to-end solution that encompasses the entire attack surface. We particularly show the elements and processes needed to enable an LLM to produce sensible and executable automotive security test suites, along with the integration with an automated testing framework. We further compare our tailored approach with general-purpose (vanilla) LLMs and compare the performance of different LLMs (namely GPT-4.1 and DeepSeek) using our approach. We also demonstrate our method step-by-step in a concrete case study. Our results show significant improvements in efficiency, accuracy, scalability, and ease of integration into any workflow, marking a substantial advancement in automating automotive security testing methodologies. Using TARAs as an input for verification tests, we create synergies by connecting two vital elements of a secure automotive development process.
Problem

Research questions and friction points this paper is trying to address.

Automating labor-intensive security test generation from attack trees for vehicles
Overcoming limited automation in creating executable automotive security test cases
Addressing error-prone manual processes in vehicular system security testing
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-based automated security test generation
Four-step self-corrective RAG framework
End-to-end attack tree to executable tests
Tanmay Khule
Department of Systems Design Engineering, University of Waterloo, ON Canada
Stefan Marksteiner
Smart Calibration and Virtual Testing Department, AVL List GmbH, Graz, Austria
Jose Alguindigue
Department of Systems Design Engineering, University of Waterloo, ON Canada
Hannes Fuchs
Smart Calibration and Virtual Testing Department, AVL List GmbH, Graz, Austria
Sebastian Fischmeister
Department of Systems Design Engineering, University of Waterloo, ON Canada
Apurva Narayan
Western University, University of British Columbia and University of Waterloo
Data Analytics · Machine Learning · AI for Social Good · Safety and Security in CPS