Are requirements really all you need? A case study of LLM-driven configuration code generation for automotive simulations

📅 2025-05-19

📈 Citations: 0

✨ Influential: 0

career value

212K/year

🤖 AI Summary

This work investigates whether large language models (LLMs) can directly and robustly translate abstract, standards-derived requirements into executable CARLA simulation configuration code in automotive industrial settings. Methodologically, it conducts the first systematic end-to-end “requirement → code” evaluation of LLMs—employing prompt engineering, domain-knowledge injection, and CARLA API-constrained decoding—on open-source models including Llama and Mistral. Results show that while LLMs achieve 68% functional correctness on structured requirements, failure rates rise sharply to 41% when confronted with real-world industrial challenges such as ambiguous phrasing or cross-document references. The study reveals a significant gap between current LLMs’ high-level semantic understanding capabilities and practical industrial deployment readiness, underscoring the indispensable role of human oversight. Furthermore, it establishes a novel benchmark and methodological framework for evaluating domain-specific LLMs in safety-critical, standards-driven domains.

Technology Category

Application Category

📝 Abstract

Large Language Models (LLMs) are taking many industries by storm. They possess impressive reasoning capabilities and are capable of handling complex problems, as shown by their steadily improving scores on coding and mathematical benchmarks. However, are the models currently available truly capable of addressing real-world challenges, such as those found in the automotive industry? How well can they understand high-level, abstract instructions? Can they translate these instructions directly into functional code, or do they still need help and supervision? In this work, we put one of the current state-of-the-art models to the test. We evaluate its performance in the task of translating abstract requirements, extracted from automotive standards and documents, into configuration code for CARLA simulations.

Problem

Research questions and friction points this paper is trying to address.

Evaluating LLMs' ability to generate automotive simulation code

Assessing translation of abstract requirements into functional code

Testing LLM performance with high-level automotive instructions

Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM translates abstract requirements into code

Automotive simulation configuration using LLM

Evaluating LLM performance in real-world tasks

🔎 Similar Papers

No similar papers found.