🤖 AI Summary
This work addresses challenges in functional testing of software-defined vehicles, where requirements are written in natural language, specifications are multimodal (text, tables, and diagrams), and test assets are fragmented across toolchains. The paper presents an end-to-end framework that uses large language models (LLMs) and vision-language models (VLMs) to automatically extract signals and behavioral logic from multimodal requirements. The framework generates Gherkin-style test scenarios and translates them into executable test scripts whose signal references follow the Vehicle Signal Specification (VSS), which supports portability across subsystems and test benches. Retrieval-augmented generation (RAG) preselects candidate VSS signals before mapping. Evaluated on a child presence detection system in simulation and on a real vehicle, the approach converted 32 of 36 requirements (89%) into executable tests, indicating feasibility, although human review and targeted substitutions remain necessary.
📝 Abstract
Testing functionality in Software-Defined Vehicles (SDVs) is challenging because requirements are written in natural language, specifications combine text, tables, and diagrams, and test assets are scattered across heterogeneous toolchains. We use Large Language Models and Vision-Language Models to extract signals and behavioral logic and to automatically generate Gherkin scenarios, which are then converted into runnable test scripts. Integrating the Vehicle Signal Specification (VSS) standardizes signal references, supporting portability across subsystems and test benches. The pipeline uses retrieval-augmented generation to preselect candidate VSS signals before mapping. We evaluate the approach on the safety-relevant Child Presence Detection System (CPDS), executing the generated tests in a virtual environment and on an actual vehicle. Our evaluation covers Gherkin validity, VSS mapping quality, and end-to-end executability. Results show that 32 of 36 requirements (89%) can be transformed into executable scenarios in our setting, while human review and targeted substitutions remain necessary. This paper is a feasibility and architectural demonstration of an end-to-end requirements-to-test pipeline for SDV subsystems, evaluated on a CPDS case in simulation and Vehicle-in-the-Loop settings.
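To make the signal preselection step concrete, the following is a minimal sketch of how candidate VSS signals might be shortlisted for a requirement before an LLM performs the final mapping. It is an illustration under stated assumptions, not the paper's implementation: the catalog entries, the function names `tokenize` and `preselect_signals`, and the example requirement are all hypothetical, and a simple token-overlap (Jaccard) score stands in for whatever retriever the actual pipeline uses. The signal paths follow VSS naming style but may not match a given VSS release.

```python
import re

# Hypothetical mini-catalog. Paths follow VSS naming style and the
# descriptions are paraphrased; neither is taken from the paper.
VSS_CATALOG = {
    "Vehicle.Speed": "vehicle speed",
    "Vehicle.Cabin.Seat.Row2.DriverSide.IsOccupied": "seat row2 occupied occupant present",
    "Vehicle.Cabin.Door.Row1.DriverSide.IsLocked": "door row1 locked lock status",
    "Vehicle.Cabin.HVAC.AmbientAirTemperature": "cabin ambient air temperature",
}


def tokenize(text):
    """Lowercase the text and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def preselect_signals(requirement, catalog, top_k=2):
    """Return up to top_k signal paths ranked by Jaccard token overlap.

    Assumption: lexical overlap is a stand-in for the retrieval step;
    signals with no shared token are dropped from the shortlist.
    """
    query = tokenize(requirement)
    scored = []
    for path, description in catalog.items():
        doc = tokenize(path) | tokenize(description)
        overlap = len(query & doc)
        if overlap == 0:
            continue
        score = overlap / len(query | doc)
        scored.append((score, path))
    # Highest score first; ties broken alphabetically for determinism.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [path for _, path in scored[:top_k]]


requirement = (
    "When a rear seat is occupied and the doors are locked, "
    "the system shall raise an alert."
)
candidates = preselect_signals(requirement, VSS_CATALOG)
print(candidates)
```

In a pipeline of this shape, only the shortlisted paths would be passed to the mapping prompt, which keeps the prompt small and limits the model to signals that exist in the catalog.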