🤖 AI Summary
This work addresses the limitations of existing approaches to autonomous driving simulation testing: manually authored scenarios do not scale, and statistical models struggle to precisely control interactive behaviors in out-of-distribution settings. The authors cast traffic scenario orchestration as a constraint-solving problem and combine large language models (LLMs) with closed-loop simulation. Specifically, an LLM translates natural language descriptions into a set of constraints, and off-the-shelf solvers then solve for actor behaviors that meet the testing intent in closed loop. On a benchmark of carefully crafted, diverse scenario descriptions, the method greatly outperforms the baselines in orchestration success rate. The results also show that the closed-loop approach is especially important for scenarios with ego-reactive specifications, and the framework aims to combine precise controllability with scalability.
📝 Abstract
Autonomous vehicles (AVs) require extensive testing in simulation, but test case generation for driving scenarios is laborious. The desired scenarios are often out-of-distribution and have precise requirements on interactions with the AV policy under test. Manually programming scenarios allows for precise controllability but is difficult to scale. On the other hand, statistical models can leverage compute and data, but struggle with precise controllability when out-of-distribution. We cast scenario orchestration as a constraint-solving problem and present a language-in, simulation-out scenario orchestrator for closed-loop testing of AVs. Our approach leverages foundation model reasoning to translate general, natural language descriptions into a set of constraints as a scenario representation. This then allows us to leverage off-the-shelf solvers to solve for actor behaviors that meet precise testing intentions in closed loop. On a benchmark of carefully crafted and diverse scenario descriptions, our approach greatly outperforms our baselines in orchestration success rate. We further show that our closed-loop approach is especially important for scenarios that require ego-reactive specifications.
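To make the "scenario as constraints, solved in closed loop" idea concrete, here is a minimal toy sketch. It is not the authors' implementation. Everything in it is an assumption for illustration: the 1D kinematics, the hypothetical `ego_policy`, the hand-written `CONSTRAINTS` (standing in for what a foundation model might produce from a description such as "the lead vehicle brakes and forces the ego to slow down without a collision"), and the brute-force grid search in `solve` (standing in for an off-the-shelf solver).

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, List, Optional, Tuple

DT = 0.1     # simulation step in seconds (assumed)
STEPS = 100  # 10 s horizon (assumed)


@dataclass
class Trace:
    gaps: List[float]        # lead position minus ego position at each step (m)
    ego_speeds: List[float]  # ego speed at each step (m/s)


def ego_policy(gap: float, speed: float) -> float:
    """Hypothetical policy under test: brake when the time gap drops below 2 s."""
    return -4.0 if gap < 2.0 * speed else 0.0


def rollout(brake_start: float, lead_decel: float) -> Trace:
    """Closed-loop rollout: the ego reacts to the lead vehicle at every step."""
    ego_x, ego_v = 0.0, 15.0
    lead_x, lead_v = 40.0, 15.0
    gaps, ego_speeds = [], []
    for k in range(STEPS):
        lead_a = -lead_decel if k * DT >= brake_start else 0.0
        ego_a = ego_policy(lead_x - ego_x, ego_v)
        lead_v = max(0.0, lead_v + lead_a * DT)
        ego_v = max(0.0, ego_v + ego_a * DT)
        lead_x += lead_v * DT
        ego_x += ego_v * DT
        gaps.append(lead_x - ego_x)
        ego_speeds.append(ego_v)
    return Trace(gaps, ego_speeds)


Constraint = Callable[[Trace], bool]

# Hand-written stand-ins for constraints derived from a language description.
CONSTRAINTS: List[Constraint] = [
    lambda tr: min(tr.gaps) > 0.0,         # no collision
    lambda tr: min(tr.gaps) < 10.0,        # a close interaction must occur
    lambda tr: min(tr.ego_speeds) < 5.0,   # ego-reactive: ego must slow down
]


def solve(constraints: List[Constraint]) -> Optional[Tuple[float, float]]:
    """Grid search over lead-vehicle parameters; checks constraints in closed loop."""
    for brake_start, lead_decel in product([1.0, 2.0, 3.0], [2.0, 4.0, 6.0, 8.0]):
        trace = rollout(brake_start, lead_decel)
        if all(c(trace) for c in constraints):
            return brake_start, lead_decel
    return None
```

Calling `solve(CONSTRAINTS)` returns the first `(brake_start, lead_decel)` pair whose rollout satisfies every constraint, or `None` if no grid point does. The third constraint depends on how the ego responds, so it can only be checked by simulating the policy under test together with the actor, which is the motivation for solving in closed loop rather than against a fixed ego trajectory.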