Inference-Time Intervention in Large Language Models for Reliable Requirement Verification

📅 2025-03-18

🤖 AI Summary
To improve the accuracy and reliability of large language models (LLMs) in requirement verification for model-based systems engineering (MBSE), this paper proposes a lightweight inference-time intervention approach. The method modifies as few as one to three specialised attention heads during the forward pass, has the LLM reason over a graph representation of Capella SysML models, and adds self-consistency to make its decisions more robust. Because the intervention is applied at inference time, it does not rely on fine-tuning model parameters. Experiments on two early-stage Capella models of space missions show that, combined with self-consistency, the method reaches perfect precision on the holdout test set when judging whether a requirement is fulfilled, improving significantly over both a baseline model and a fine-tuning approach. The work shows that attention-head-level intervention offers fine-grained control for LLM-based requirement verification in MBSE.

📝 Abstract
Steering the behavior of Large Language Models (LLMs) remains a challenge, particularly in engineering applications where precision and reliability are critical. While fine-tuning and prompting methods can modify model behavior, they lack the dynamic and exact control necessary for engineering applications. Inference-time intervention techniques provide a promising alternative, allowing targeted adjustments to LLM outputs. In this work, we demonstrate how interventions enable fine-grained control for automating the usually time-intensive requirement verification process in Model-Based Systems Engineering (MBSE). Using two early-stage Capella SysML models of space missions with associated requirements, we apply the intervened LLMs to reason over a graph representation of the model to determine whether a requirement is fulfilled. Our method achieves robust and reliable outputs, significantly improving over both a baseline model and a fine-tuning approach. By identifying and modifying as few as one to three specialised attention heads, we can significantly change the model's behavior. When combined with self-consistency, this allows us to achieve perfect precision on our holdout test set.
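The self-consistency step mentioned at the end of the abstract can be illustrated with a minimal sketch. The abstract does not state the aggregation rule, so the majority threshold, the answer labels `"fulfilled"` / `"not fulfilled"`, and the function names below are assumptions made for illustration only, not the authors' implementation.

```python
from collections import Counter


def self_consistent_verdict(samples, min_share=0.5):
    """Aggregate several sampled LLM answers into one verdict.

    Assumption: each sample is the string "fulfilled" or "not fulfilled".
    The requirement counts as fulfilled only if strictly more than
    `min_share` of the samples agree; ties fall back to "not fulfilled",
    which favours precision over recall.
    """
    if not samples:
        raise ValueError("need at least one sampled answer")
    counts = Counter(s.strip().lower() for s in samples)
    share = counts["fulfilled"] / len(samples)
    return "fulfilled" if share > min_share else "not fulfilled"


def precision(predictions, labels, positive="fulfilled"):
    """Precision = true positives / (true positives + false positives)."""
    pairs = list(zip(predictions, labels))
    tp = sum(p == positive and y == positive for p, y in pairs)
    fp = sum(p == positive and y != positive for p, y in pairs)
    return tp / (tp + fp) if (tp + fp) else 0.0
```

Raising `min_share` toward 1.0 makes the aggregate verdict stricter: fewer requirements are reported as fulfilled, and those that are have more agreement behind them.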
Problem

Research questions and friction points this paper is trying to address.

Dynamic control of LLMs for precise engineering applications
Automating requirement verification in Model-Based Systems Engineering
Enhancing reliability and precision in LLM outputs via intervention
Innovation

Methods, ideas, or system contributions that make the work stand out.

Inference-time intervention for precise LLM control
Graph-based reasoning for requirement verification
Modifying attention heads to enhance model behavior
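
As a rough illustration of what modifying a few attention heads at inference time can look like, the sketch below adds a scaled steering vector to the outputs of selected heads and leaves all other heads untouched. This summary does not describe the paper's actual modification rule, so the additive form, the scale `alpha`, and every name here are hypothetical; the code uses plain Python lists in place of real model activations.

```python
def intervene_on_heads(head_outputs, directions, selected_heads, alpha):
    """Shift the activations of selected attention heads for one token.

    head_outputs:   list of per-head activation vectors (lists of floats)
    directions:     dict mapping head index -> steering vector (hypothetical,
                    e.g. found by probing which heads track the target behavior)
    selected_heads: set of head indices to modify (e.g. one to three heads)
    alpha:          intervention strength (assumed scalar hyperparameter)
    """
    steered = []
    for idx, vec in enumerate(head_outputs):
        if idx in selected_heads:
            direction = directions[idx]
            if len(direction) != len(vec):
                raise ValueError(f"direction for head {idx} has wrong length")
            # Additive shift: activation + alpha * steering direction.
            steered.append([v + alpha * d for v, d in zip(vec, direction)])
        else:
            # Unselected heads pass through unchanged (copied, not aliased).
            steered.append(list(vec))
    return steered


# Usage: three heads with 2-dimensional outputs, intervening on head 1 only.
heads = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
shifted = intervene_on_heads(heads, {1: [1.0, -1.0]}, {1}, alpha=0.5)
```

No weights are updated in this scheme; only activations are changed during the forward pass, which is what distinguishes it from fine-tuning.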