🤖 AI Summary
In automotive software development, missing or erroneous traceability links between stakeholder requirements and system requirements lead to inconsistencies and compliance risks. Existing approaches rely on open-source datasets, lack industrial validation, do not check links created manually by engineers, and struggle with the varied ways automotive requirements are phrased, which is a problem for supervised models that need training data. This paper presents TVR, a traceability validation and recovery approach evaluated on real-world automotive requirement data, which combines large language models (LLMs) with retrieval-augmented generation (RAG). TVR both validates existing, manually established links and recovers missing ones. Experimental results show 98.87% accuracy in link validation, 85.50% correctness in missing-link recovery, and 97.13% accuracy on unseen requirement variations.
📝 Abstract
In automotive software development, as in other domains, traceability between stakeholder requirements and system requirements is crucial to ensure consistency, correctness, and regulatory compliance. However, erroneous or missing traceability relationships often arise from improper propagation of requirement changes or human errors in requirement mapping, leading to inconsistencies and increased maintenance costs. Existing approaches do not address traceability between stakeholder and system requirements, rely on open-source data rather than automotive (or other industrial) data, and do not address the validation of manual links established by engineers. Additionally, automotive requirements often vary in the way they are expressed, posing challenges for supervised models that require training. Recent advances in large language models (LLMs) provide new opportunities to address these challenges. In this paper, we introduce TVR, a requirement Traceability Validation and Recovery approach primarily targeting automotive systems, leveraging LLMs enhanced with retrieval-augmented generation (RAG). TVR is designed to validate existing traceability links and recover missing ones with high accuracy. We empirically evaluate TVR on automotive requirements, achieving 98.87% accuracy in traceability validation and 85.50% correctness in traceability recovery. Additionally, TVR demonstrates strong robustness, achieving 97.13% accuracy on unseen requirement variations. The results highlight the practical effectiveness of RAG-based LLM approaches in industrial settings, offering a promising solution for improving requirements traceability in complex automotive systems.
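To make the retrieval-augmented idea concrete, the following is a minimal sketch of how a retrieval step could feed a link-validation prompt. It is not TVR's implementation: the abstract does not specify the retriever, the prompt, or the model. The bag-of-words similarity, the stopword list, the function names (`retrieve`, `build_validation_prompt`), the prompt wording, and the sample requirements are all assumptions made for illustration, and no LLM is actually called.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list; a real system would likely use dense embeddings instead.
STOPWORDS = {"the", "shall", "a", "an", "of", "to", "when"}


def tokenize(text):
    """Lowercase a requirement, split it into tokens, and drop stopwords."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS]


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, corpus, k=2):
    """Return the ids of the k corpus requirements most similar to the query."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(text))), rid) for rid, text in corpus.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # best score first, id as tie-break
    return [rid for _, rid in scored[:k]]


def build_validation_prompt(stakeholder_req, system_req, context):
    """Assemble a prompt (assumed wording) asking an LLM to judge one trace link."""
    context_block = "\n".join(f"- {text}" for text in context)
    return (
        "Related system requirements:\n"
        f"{context_block}\n\n"
        f"Stakeholder requirement: {stakeholder_req}\n"
        f"System requirement: {system_req}\n"
        "Is the trace link between these two requirements valid? Answer yes or no."
    )


# Invented sample data, used only to exercise the sketch.
corpus = {
    "SYS-1": "The braking controller shall apply brake pressure within 50 ms of a pedal signal.",
    "SYS-2": "The infotainment unit shall display the current radio station.",
    "SYS-3": "The braking controller shall log every brake pedal signal.",
}
stakeholder = "The vehicle shall brake promptly when the driver presses the brake pedal."

top_ids = retrieve(stakeholder, corpus, k=2)
prompt = build_validation_prompt(
    stakeholder, corpus["SYS-1"], [corpus[rid] for rid in top_ids]
)
```

In this sketch the retrieved requirements serve only as context for the model's judgment. The same retrieval step could plausibly supply candidate links for recovery, with the model asked to confirm or reject each candidate.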