STAF: Leveraging LLMs for Automated Attack Tree-Based Security Test Generation

📅 2025-09-24
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Low efficiency, high error rates, and insufficient automation plague the conversion of attack trees into test cases in automotive security testing. Method: this paper proposes the first LLM-driven security test case generation framework tailored to in-vehicle systems. It employs a four-step self-correcting retrieval-augmented generation (RAG) pipeline (attack tree parsing, semantic retrieval, multi-round logical validation, and executable test script generation) to achieve end-to-end automation. A domain-knowledge-guided self-correction strategy ensures generated test cases are semantically sound, safety-compliant, and executable, while tightly integrating attack tree analysis with automated test verification. Contribution/Results: experiments demonstrate substantial improvements: 12× higher generation efficiency than manual methods, 94.3% accuracy, and strong cross-model compatibility (supporting GPT-4.1, DeepSeek, and others). The framework has been validated on real-world automotive ECUs.
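The four-step pipeline described above can be sketched in outline form. This is a minimal illustrative skeleton, not the paper's actual implementation: all names, the keyword-overlap retrieval (standing in for real semantic/embedding retrieval), and the toy validity check are assumptions made for clarity.

```python
# Hypothetical sketch of a four-step self-correcting RAG pipeline for
# attack-tree-based test generation. Interfaces are illustrative, not STAF's API.
from dataclasses import dataclass, field

@dataclass
class AttackTreeNode:
    goal: str
    children: list = field(default_factory=list)

def parse_attack_tree(root: AttackTreeNode) -> list:
    """Step 1: flatten the attack tree into leaf-level attack steps."""
    if not root.children:
        return [root.goal]
    steps = []
    for child in root.children:
        steps.extend(parse_attack_tree(child))
    return steps

def retrieve_context(step: str, knowledge_base: dict) -> str:
    """Step 2: retrieve domain knowledge for a step.
    Keyword overlap stands in for real embedding-based semantic retrieval."""
    best = max(knowledge_base,
               key=lambda k: len(set(k.split()) & set(step.split())))
    return knowledge_base[best]

def generate_test(step: str, context: str, llm) -> str:
    """Step 4: prompt the LLM for an executable test script."""
    return llm(f"Write an executable test for: {step}\nContext: {context}")

def validate_and_correct(script: str, llm, max_rounds: int = 3) -> str:
    """Step 3: multi-round logical validation with self-correction.
    A real system would check semantics, safety compliance, and executability;
    here a single toy check (presence of an assertion) stands in."""
    for _ in range(max_rounds):
        if "assert" in script:
            return script
        script = llm(f"Fix this test so it contains an assertion:\n{script}")
    return script

def staf_pipeline(tree: AttackTreeNode, knowledge_base: dict, llm) -> list:
    """End-to-end: attack tree in, validated test scripts out."""
    tests = []
    for step in parse_attack_tree(tree):
        context = retrieve_context(step, knowledge_base)
        draft = generate_test(step, context, llm)
        tests.append(validate_and_correct(draft, llm))
    return tests
```

In practice the `llm` parameter would wrap a model such as GPT-4.1 or DeepSeek, and the validation loop would feed concrete error diagnostics back into the correction prompt rather than a fixed instruction.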

📝 Abstract
In modern automotive development, security testing is critical for safeguarding systems against increasingly advanced threats. Attack trees are widely used to systematically represent potential attack vectors, but generating comprehensive test cases from these trees remains a labor-intensive, error-prone task that has seen limited automation in the context of testing vehicular systems. This paper introduces STAF (Security Test Automation Framework), a novel approach to automating security test case generation. Leveraging Large Language Models (LLMs) and a four-step self-corrective Retrieval-Augmented Generation (RAG) framework, STAF automates the generation of executable security test cases from attack trees, providing an end-to-end solution that encompasses the entire attack surface. We particularly show the elements and processes needed to enable an LLM to produce sensible and executable automotive security test suites, along with the integration with an automated testing framework. We further compare our tailored approach with general-purpose (vanilla) LLMs and compare the performance of different LLMs (namely GPT-4.1 and DeepSeek) using our approach. We also demonstrate our method step-by-step in a concrete case study. Our results show significant improvements in efficiency, accuracy, scalability, and ease of integration into any workflow, marking a substantial advancement in automating automotive security testing methodologies. Using TARAs as an input for verification tests, we create synergies by connecting two vital elements of a secure automotive development process.
Problem

Research questions and friction points this paper is trying to address.

Automating labor-intensive security test generation from attack trees for vehicles
Overcoming limited automation in creating executable automotive security test cases
Addressing error-prone manual processes in vehicular system security testing
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-based automated security test generation
Four-step self-corrective RAG framework
End-to-end attack tree to executable tests
Tanmay Khule
Department of Systems Design Engineering, University of Waterloo, ON Canada
Stefan Marksteiner
Smart Calibration and Virtual Testing Department, AVL List GmbH, Graz, Austria
Jose Alguindigue
Department of Systems Design Engineering, University of Waterloo, ON Canada
Hannes Fuchs
Smart Calibration and Virtual Testing Department, AVL List GmbH, Graz, Austria
Sebastian Fischmeister
Department of Systems Design Engineering, University of Waterloo, ON Canada
Apurva Narayan
Western University, University of British Columbia and University of Waterloo
Data Analytics · Machine Learning · AI for Social Good · Safety and Security in CPS