STADA: Specification-based Testing for Autonomous Driving Agents

📅 2026-03-11

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing simulation-based testing methods for autonomous driving rely on template-based, manually constructed, or random scenario generation. When used to validate formally specified safety requirements, they either demand significant human effort or risk missing behavior relevant to the requirement. This work presents STADA, a framework that systematically generates the space of test scenarios defined by an LTLf specification formalized in SCENEFLOW: all distinct initial scenes, a diverse space of continuations of those scenes, and simulations that reflect the behaviors of the specification. Across a variety of LTLf specifications and three complementary coverage criteria, STADA achieves more than 2x the coverage of the best baseline on the finest criterion and a 75% increase on the coarsest, and it matches the best baseline's coverage with 6 times fewer simulations.

📝 Abstract

Simulation-based testing has become a standard approach to validating autonomous driving agents prior to real-world deployment. A high-quality validation campaign will exercise an agent in diverse contexts composed of varying static environments, e.g., lanes, intersections, signage, and dynamic elements, e.g., vehicles and pedestrians. To achieve this, existing test generation techniques rely on template-based, manually constructed, or random scenario generation. When applied to validate formally specified safety requirements, such methods either require significant human effort or run the risk of missing important behavior related to the requirement. To address this gap, we present STADA, a Specification-based Test generation framework for Autonomous Driving Agents that systematically generates the space of scenarios defined by a formal specification expressed in temporal logic (LTLf). Given a specification, STADA constructs all distinct initial scenes, a diverse space of continuations of those scenes, and simulations that reflect the behaviors of the specification. Evaluation of STADA on a variety of LTLf specifications formalized in SCENEFLOW using three complementary coverage criteria demonstrates that STADA yields more than 2x higher coverage than the best baseline on the finest criterion and a 75% increase on the coarsest criterion. Moreover, it matches the coverage of the best baseline with 6 times fewer simulations. While set in the context of autonomous driving, the approach is applicable to other domains with rich simulation environments.
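To make the notion of an LTLf specification over a simulation concrete, the following minimal sketch evaluates a formula on a finite trace using the standard finite-trace semantics. It is not STADA's implementation and does not use SCENEFLOW syntax; the tuple encoding of formulas, the `holds` function, and the atoms `ped_in_crosswalk` and `ego_stopped` are hypothetical names chosen for illustration.

```python
from typing import List, Set, Union

# Hypothetical encoding: an atom is a string; a compound formula is a tuple
# such as ("G", phi), ("implies", phi, psi), or ("U", phi, psi).
Formula = Union[str, tuple]
Trace = List[Set[str]]  # one set of true atoms per simulation step (non-empty list)


def holds(f: Formula, trace: Trace, i: int = 0) -> bool:
    """Return True if formula f holds at step i of a finite, non-empty trace."""
    if isinstance(f, str):  # atomic proposition
        return f in trace[i]
    op = f[0]
    if op == "not":
        return not holds(f[1], trace, i)
    if op == "and":
        return holds(f[1], trace, i) and holds(f[2], trace, i)
    if op == "or":
        return holds(f[1], trace, i) or holds(f[2], trace, i)
    if op == "implies":
        return (not holds(f[1], trace, i)) or holds(f[2], trace, i)
    if op == "X":  # strong next: a successor step must exist
        return i + 1 < len(trace) and holds(f[1], trace, i + 1)
    if op == "F":  # eventually, within the remaining steps
        return any(holds(f[1], trace, j) for j in range(i, len(trace)))
    if op == "G":  # always, for all remaining steps
        return all(holds(f[1], trace, j) for j in range(i, len(trace)))
    if op == "U":  # f[1] holds until f[2] holds, and f[2] must occur
        return any(
            holds(f[2], trace, j)
            and all(holds(f[1], trace, k) for k in range(i, j))
            for j in range(i, len(trace))
        )
    raise ValueError(f"unknown operator: {op!r}")


# Illustrative requirement: whenever a pedestrian is in the crosswalk,
# the ego vehicle eventually stops.
spec = ("G", ("implies", "ped_in_crosswalk", ("F", "ego_stopped")))

good = [set(), {"ped_in_crosswalk"}, {"ped_in_crosswalk", "ego_stopped"}, set()]
bad = [{"ped_in_crosswalk"}, set()]

print(holds(spec, good))  # True
print(holds(spec, bad))   # False
```

Because the trace is finite, `X` is false at the last step and `F` must be satisfied before the trace ends, which is what distinguishes LTLf from LTL over infinite traces.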

Problem

Research questions and friction points this paper is trying to address.

autonomous driving

specification-based testing

scenario generation

formal verification

simulation-based validation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Specification-based testing

Autonomous driving agents

LTLf