🤖 AI Summary
Existing cybersecurity response playbooks are predominantly heterogeneous and non-machine-readable, which impedes automation and interoperability across SOAR platforms. To address this, we propose a modular, LLM-driven translation framework that uses prompt engineering to convert unstructured playbooks into the standardized CACAO format automatically. The method combines syntactic validation, semantic fidelity constraints, and multi-turn iterative refinement so that complex control flows are reconstructed accurately and operational semantics are preserved. Evaluated on a curated benchmark dataset, the approach outperforms baseline models: it reduces the CACAO syntax error rate by 72%, improves recall of critical procedural nodes by 39%, and supports end-to-end deployment. This work establishes a reusable technical pathway for the structured representation and automated orchestration of security knowledge, advancing the operationalization of human-authored playbooks in autonomous security systems.
📝 Abstract
Existing cybersecurity playbooks are often written in heterogeneous, non-machine-readable formats, which limits their automation and interoperability across Security Orchestration, Automation, and Response (SOAR) platforms. This paper explores the suitability of Large Language Models (LLMs), combined with Prompt Engineering, for automatically translating legacy incident response playbooks into the standardized, machine-readable CACAO format. We systematically examine a range of Prompt Engineering techniques and design prompts that aim to maximize syntactic accuracy and semantic fidelity, with particular attention to preserving control flow. Our modular transformation pipeline integrates a syntax checker to ensure syntactic correctness and an iterative refinement mechanism that progressively reduces syntactic errors. We evaluate the proposed approach on a custom-generated dataset of diverse legacy playbooks paired with manually created CACAO references. The results demonstrate that our method significantly improves the accuracy of playbook transformation over baseline models, effectively captures complex workflow structures, and substantially reduces errors. These results highlight the potential for practical deployment in automated cybersecurity playbook transformation tasks.
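The checker-plus-refinement loop described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names (`check_syntax`, `translate_with_refinement`, `make_fake_llm`), the prompt wording, the `max_rounds` limit, and the simplified step identifiers are all hypothetical, and the checker tests only a small subset of CACAO-style structural rules rather than the full CACAO schema. The `llm` argument is any callable that maps a prompt string to a response string, so no particular model API is assumed.

```python
import json
from typing import Callable, Iterable, List, Tuple

# Assumption: a small subset of top-level properties, not the full CACAO schema.
REQUIRED_KEYS = ("type", "id", "name", "workflow_start", "workflow")
# Assumption: step properties that reference other workflow steps.
LINK_KEYS = ("on_completion", "on_true", "on_false", "on_success", "on_failure")


def check_syntax(candidate: str) -> List[str]:
    """Return error messages for a candidate playbook; an empty list means it passed."""
    try:
        doc = json.loads(candidate)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg} at line {exc.lineno}"]
    if not isinstance(doc, dict):
        return ["top-level value must be a JSON object"]

    errors = [f"missing property '{key}'" for key in REQUIRED_KEYS if key not in doc]
    if "type" in doc and doc["type"] != "playbook":
        errors.append("property 'type' must be 'playbook'")

    workflow = doc.get("workflow", {})
    if not isinstance(workflow, dict):
        errors.append("property 'workflow' must be an object")
        return errors

    start = doc.get("workflow_start")
    if start is not None and start not in workflow:
        errors.append(f"workflow_start points to unknown step '{start}'")

    # Control-flow check: every link must point to a step that exists.
    for step_id, step in workflow.items():
        if not isinstance(step, dict):
            errors.append(f"step '{step_id}' must be an object")
            continue
        for key in LINK_KEYS:
            target = step.get(key)
            if target is not None and target not in workflow:
                errors.append(f"step '{step_id}': {key} points to unknown step '{target}'")
    return errors


def translate_with_refinement(
    playbook_text: str,
    llm: Callable[[str], str],
    max_rounds: int = 3,
) -> Tuple[str, List[str], int]:
    """Translate once, then feed checker errors back to the model until it passes."""
    candidate = llm(f"Translate this playbook into CACAO JSON:\n{playbook_text}")
    errors = check_syntax(candidate)
    rounds = 0
    while errors and rounds < max_rounds:
        feedback = "\n".join(f"- {error}" for error in errors)
        candidate = llm(
            "The CACAO JSON below failed validation.\n"
            f"Errors:\n{feedback}\n\nPrevious output:\n{candidate}\n\n"
            "Return corrected JSON only."
        )
        errors = check_syntax(candidate)
        rounds += 1
    return candidate, errors, rounds


def make_fake_llm(responses: Iterable[str]) -> Callable[[str], str]:
    """Stand-in for a real model: replays canned responses in order."""
    replay = iter(responses)
    return lambda prompt: next(replay)
```

Calling `translate_with_refinement(text, make_fake_llm([first_draft, corrected_draft]))` replays two canned outputs, which makes the loop easy to exercise without a model. The function returns the final candidate, any remaining errors, and the number of refinement rounds used, so a caller can tell a clean result from one that hit the round limit.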