Are requirements really all you need? A case study of LLM-driven configuration code generation for automotive simulations

📅 2025-05-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work investigates whether large language models (LLMs) can robustly translate abstract, standards-derived requirements directly into executable CARLA simulation configuration code in industrial automotive settings. Methodologically, it conducts the first systematic end-to-end "requirement → code" evaluation of LLMs, combining prompt engineering, domain-knowledge injection, and CARLA-API-constrained decoding, on open-source models including Llama and Mistral. Results show that while the models achieve 68% functional correctness on well-structured requirements, failure rates rise sharply to 41% when they face real-world industrial complications such as ambiguous phrasing or cross-document references. The study reveals a significant gap between current LLMs' high-level semantic understanding and practical readiness for industrial deployment, underscoring the indispensable role of human oversight. It also establishes a benchmark and methodological framework for evaluating domain-specific LLMs in safety-critical, standards-driven domains.
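To make the task concrete, below is a minimal sketch of the kind of CARLA configuration script such a pipeline is expected to emit. This is our illustration, not code from the paper; the example requirement, town, vehicle blueprint, and weather values are all assumptions.

```python
# Hypothetical target output for a requirement such as:
# "The ego vehicle shall drive in clear weather on Town03."
import carla

client = carla.Client("localhost", 2000)  # default CARLA server endpoint
client.set_timeout(10.0)
world = client.load_world("Town03")

# Requirement clause "clear weather" mapped onto weather parameters.
world.set_weather(carla.WeatherParameters(
    cloudiness=0.0, precipitation=0.0, fog_density=0.0))

# Requirement clause "ego vehicle": spawn one vehicle at a free spawn point.
ego_bp = world.get_blueprint_library().filter("vehicle.tesla.model3")[0]
spawn_point = world.get_map().get_spawn_points()[0]
ego = world.spawn_actor(ego_bp, spawn_point)

# Driving behavior is delegated to the built-in autopilot here; a generated
# script could instead emit a custom controller for stricter requirements.
ego.set_autopilot(True)
```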

📝 Abstract
Large Language Models (LLMs) are taking many industries by storm. They possess impressive reasoning capabilities and are capable of handling complex problems, as shown by their steadily improving scores on coding and mathematical benchmarks. However, are the models currently available truly capable of addressing real-world challenges, such as those found in the automotive industry? How well can they understand high-level, abstract instructions? Can they translate these instructions directly into functional code, or do they still need help and supervision? In this work, we put one of the current state-of-the-art models to the test. We evaluate its performance in the task of translating abstract requirements, extracted from automotive standards and documents, into configuration code for CARLA simulations.
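As a rough illustration of how such a "requirement → code" evaluation could be wired up, consider the sketch below. It is not the authors' harness: the prompt wording is invented, `generate` stands in for whatever LLM backend is used (e.g. a local Llama or Mistral endpoint), and the check is a syntax-only first stage rather than the paper's functional-correctness criterion.

```python
import ast

def build_prompt(requirement: str) -> str:
    """Assemble a prompt that injects domain context before the requirement."""
    return (
        "You are an expert in CARLA simulation scripting.\n"
        "Translate the following automotive requirement into a runnable "
        "CARLA Python configuration script.\n\n"
        f"Requirement: {requirement}\n"
    )

def is_syntactically_valid(code: str) -> bool:
    """First-stage check: does the generated script parse as Python?"""
    try:
        ast.parse(code)
        return True
    except SyntaxError:
        return False

def evaluate(requirements, generate):
    """Run each requirement through the model and record whether it parses."""
    return [(req, is_syntactically_valid(generate(build_prompt(req))))
            for req in requirements]
```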
Problem

Research questions and friction points this paper is trying to address.

Evaluating LLMs' ability to generate automotive simulation code
Assessing translation of abstract requirements into functional code
Testing LLM performance with high-level automotive instructions
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM translates abstract requirements into code
Automotive simulation configuration using LLM
Evaluating LLM performance in real-world tasks
Krzysztof Lebioda
Technical University of Munich (TUM), School of Computation, Information and Technology (CIT), Chair of Robotics, Artificial Intelligence and Embedded Systems
Nenad Petrovic
Faculty of Electronic Engineering, University of Nis
Semantic Technology · Model-Driven Software Engineering · Domain-Specific Languages · LLM
F. Pan
Technical University of Munich (TUM), School of Computation, Information and Technology (CIT), Chair of Robotics, Artificial Intelligence and Embedded Systems
Vahid Zolfaghari
Technical University of Munich
Large Language Models · Autonomous Driving
André Schamschurko
Research assistant, Technical University of Munich
Large Language Models · Natural Language Processing
Alois Knoll
Technische Universität München
Robotics · AI · Sensor Data Fusion · Autonomous Driving · Cyber Physical Systems