🤖 AI Summary
To address the fragmentation, low reusability, and development inefficiency of test assets in automotive Hardware-in-the-Loop (HIL) testing, this paper proposes HIL-GPT: a domain-adapted, lightweight large language model (LLM) system that integrates Retrieval-Augmented Generation (RAG) with a fine-tuned semantic embedding model to enable traceable, bidirectional retrieval between requirements and test cases. Methodologically, we introduce a data curation pipeline that combines heuristic mining with LLM-based synthesis to construct a high-quality, domain-specific dataset. We empirically show that compact embedding models achieve a superior trade-off among accuracy, inference latency, and deployment cost, challenging the prevailing "bigger is better" assumption. A/B experiments demonstrate that HIL-GPT significantly outperforms general-purpose LLMs in practical utility, result reliability, and user satisfaction.
📝 Abstract
Hardware-in-the-Loop (HIL) testing is essential for automotive validation but suffers from fragmented and underutilized test artifacts. This paper presents HIL-GPT, a retrieval-augmented generation (RAG) system integrating domain-adapted large language models (LLMs) with semantic retrieval. HIL-GPT leverages embedding fine-tuning on a domain-specific dataset constructed via heuristic mining and LLM-assisted synthesis, combined with vector indexing for scalable, traceable retrieval of test cases and requirements. Experiments show that fine-tuned compact models, such as `bge-base-en-v1.5`, achieve a superior trade-off among accuracy, latency, and cost compared to larger models, challenging the notion that bigger is always better. An A/B user study further confirms that RAG-enhanced assistants improve perceived helpfulness, truthfulness, and satisfaction over general-purpose LLMs. These findings provide insights for deploying efficient, domain-aligned LLM-based assistants in industrial HIL environments.
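To make the retrieval step concrete, the following is a minimal sketch of embedding-based requirement retrieval over a small in-memory vector index. It is illustrative only: the requirement strings are invented examples, and simple normalized token-count vectors stand in for the learned embeddings a fine-tuned model such as `bge-base-en-v1.5` would produce (a real system would call the model's encoder instead of `embed`).

```python
import numpy as np
from collections import Counter

# Hypothetical requirement corpus (invented examples, not from the paper).
requirements = [
    "REQ-101: brake pedal signal shall trigger HIL fault injection",
    "REQ-102: battery voltage shall stay within 11 to 14 volts",
    "REQ-103: CAN bus timeout shall raise a diagnostic trouble code",
]

# Vocabulary built from the corpus; query words outside it are ignored.
vocab = sorted({w for r in requirements for w in r.lower().split()})

def embed(text: str) -> np.ndarray:
    """Stand-in embedding: L2-normalized token counts over the vocabulary."""
    counts = Counter(text.lower().split())
    v = np.array([counts[w] for w in vocab], dtype=float)
    n = np.linalg.norm(v)
    return v / n if n else v

# "Vector index": one embedding row per requirement.
index = np.stack([embed(r) for r in requirements])

def retrieve(query: str, k: int = 1):
    """Return the top-k requirements by cosine similarity to the query."""
    q = embed(query)
    scores = index @ q  # dot product == cosine similarity (rows are unit norm)
    top = np.argsort(scores)[::-1][:k]
    return [(requirements[i], float(scores[i])) for i in top]

print(retrieve("test case for brake pedal fault injection"))
```

Bidirectional traceability follows from the same mechanism: indexing test cases instead of requirements lets a requirement text serve as the query, so links can be recovered in either direction.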