🤖 AI Summary
To address the complexity and inefficiency of manually authoring Object Constraint Language (OCL) rules in Model-Based Systems Engineering (MBSE), this paper proposes an optimized Retrieval-Augmented Generation (RAG) framework for OCL generation. Methodologically, it presents the first systematic comparison of BM25, BERT-based dense retrieval, and SPLADE sparse retrieval for this task, showing that SPLADE performs best, particularly at low retrieval depths (k=3). Integrated with large language model (LLM) generation and evaluated against the graph-based PathOCL baseline, SPLADE achieves an F1 score of 0.72, outperforming both BM25 and PathOCL. The key contribution is empirical evidence that sparse semantic retrieval is well suited to OCL generation, validating a "few but high-precision" retrieval strategy that balances context relevance and output consistency. The work also provides practical configuration guidelines for deploying RAG in MBSE contexts.
📝 Abstract
The Object Constraint Language (OCL) is essential for defining precise constraints within Model-Based Systems Engineering (MBSE). However, manually writing OCL rules is complex and time-consuming. This study explores the optimization of Retrieval-Augmented Generation (RAG) for automating OCL rule generation, focusing on the impact of different retrieval strategies. We evaluate three retrieval approaches – BM25 (lexical), BERT-based (dense semantic), and SPLADE (sparse-vector) – analyzing their effectiveness in providing relevant context to a large language model. To further assess our approach, we benchmark our retrieval-optimized generation results against PathOCL, a state-of-the-art graph-based method, comparing all three retrieval methods within a unified evaluation framework. Our experimental results indicate that while retrieval can enhance generation accuracy, its effectiveness depends on the retrieval method and on the number of retrieved chunks (k). BM25 underperforms the baseline, whereas the semantic approaches (BERT and SPLADE) achieve better results, with SPLADE performing best at lower k values. However, retrieving too many chunks (a high k) introduces irrelevant context and degrades model performance. Our findings highlight the importance of tuning retrieval configurations to balance context relevance and output consistency, and underscore the need to tailor retrieval strategies to the OCL generation task.
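The retrieval step the abstract describes can be sketched as a minimal lexical (BM25-style) retriever over UML-model chunks, with the top-k chunks spliced into an LLM prompt. This is an illustrative pure-Python re-implementation under stated assumptions, not the paper's code: the model chunks, the prompt template, and the function names are hypothetical, and the paper's actual pipeline also covers BERT and SPLADE retrieval.

```python
import math
from collections import Counter

def bm25_scores(query, corpus, k1=1.5, b=0.75):
    """Score each chunk in `corpus` against `query` with the Okapi BM25
    formula (illustrative; the paper's exact retriever setup may differ)."""
    docs = [d.lower().split() for d in corpus]
    avgdl = sum(len(d) for d in docs) / len(docs)
    n_docs = len(docs)
    df = Counter(t for d in docs for t in set(d))  # document frequency
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for t in query.lower().split():
            if t not in tf:
                continue
            idf = math.log(1 + (n_docs - df[t] + 0.5) / (df[t] + 0.5))
            norm = k1 * (1 - b + b * len(d) / avgdl)
            s += idf * tf[t] * (k1 + 1) / (tf[t] + norm)
        scores.append(s)
    return scores

def retrieve_top_k(query, corpus, k=3):
    """Return the k highest-scoring chunks to use as LLM context."""
    ranked = sorted(zip(bm25_scores(query, corpus), corpus), reverse=True)
    return [chunk for _, chunk in ranked[:k]]

# Hypothetical UML-model chunks standing in for a serialized class model.
chunks = [
    "class Account attribute balance type Integer",
    "class Customer association owns Account",
    "class Bank operation openAccount",
]
ctx = retrieve_top_k("account balance must be non-negative", chunks, k=2)
prompt = ("Context:\n" + "\n".join(ctx) +
          "\nWrite an OCL invariant for the requirement above.")
```

The k parameter here mirrors the paper's finding: a small k keeps only high-precision context, while a large k would pull in irrelevant chunks and dilute the prompt.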