🤖 AI Summary
Existing approaches to generating rare, critical scenarios for autonomous driving planners suffer from low efficiency, heavy reliance on large-scale data collection, or labor-intensive manual design. Method: This paper proposes a traffic scenario augmentation framework based on a large language model (LLM) agent. By understanding and reasoning over natural language instructions, the framework augments real-world traffic scenes in a semantically controllable way, efficiently generating high-fidelity, challenging, and rare driving situations while preserving the original data distribution. The agentic architecture supports fine-grained procedural control, enabling even lightweight LLMs to achieve robust performance. Contribution/Results: Human expert evaluation shows that the generated scenarios are comparable to human-designed ones in intent alignment and quality, significantly outperforming conventional data-driven methods. The framework demonstrates strong practical potential for large-scale automated testing of autonomous driving systems.
📝 Abstract
Rare yet critical scenarios pose a significant challenge in testing and evaluating autonomous driving planners. Relying solely on real-world driving scenes requires collecting massive datasets to capture these scenarios. While automatic generation of traffic scenarios appears promising, data-driven models require extensive training data and often lack fine-grained control over the output. Moreover, generating novel scenarios from scratch can introduce a distributional shift from the original training scenes, which undermines the validity of evaluations, especially for learning-based planners. To sidestep this, recent work proposes generating challenging scenarios by augmenting original scenarios from the test set. However, this requires manual augmentation of scenarios by domain experts, an approach that cannot meet the demands for scale in the evaluation of self-driving systems. Therefore, this paper introduces a novel LLM-agent-based framework for augmenting real-world traffic scenarios using natural language descriptions, addressing the limitations of existing methods. A key innovation is the use of an agentic design, which enables fine-grained control over the output and maintains high performance even with smaller, cost-effective LLMs. Extensive human expert evaluation demonstrates our framework's ability to accurately adhere to user intent, generating high-quality augmented scenarios comparable to those created manually.