RvLLM: LLM Runtime Verification with Domain Knowledge

📅 2025-05-24
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Large language models (LLMs) remain prone to low-level errors in high-stakes applications, and existing general-purpose detection methods overlook domain-specific semantics. Method: We propose RvLLM, a lightweight, knowledge-augmented runtime verification framework. Its core component is ESL (Expert Specification Language), a domain-customizable declarative language that lets experts formally encode semantic constraints; LLM outputs are then validated against these constraints by a compact constraint parser and logical checker. Contribution/Results: RvLLM incurs minimal overhead (<50 ms per sample), supports flexible domain adaptation, and generalizes across tasks. Evaluated on traffic-regulation violation detection, numerical comparison, and inequality solving, it consistently identifies erroneous outputs across diverse LLMs, achieving significant accuracy gains over generic detectors that ignore domain semantics.
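ESL's actual syntax is not shown on this page, but the parse-then-check pipeline the summary describes can be sketched in plain Python, with expert constraints modeled as predicates over the parsed answer (all function and field names below are hypothetical, not RvLLM's API):

```python
# Minimal sketch of knowledge-augmented runtime verification in the
# spirit of RvLLM: parse the LLM's answer into facts, then check the
# facts against expert-authored constraints. Names are illustrative.
import re
from typing import Callable

# A "specification" maps a constraint name to a predicate over the
# parsed answer; domain experts would author these (in ESL, in RvLLM).
Spec = dict[str, Callable[[dict], bool]]

def parse_numeric_answer(llm_output: str) -> dict:
    """Extract operands and the claimed relation from a numerical-
    comparison answer such as '9.11 is greater than 9.9'."""
    m = re.search(r"([\d.]+) is (greater|less) than ([\d.]+)", llm_output)
    if not m:
        return {}
    return {"a": float(m.group(1)),
            "rel": m.group(2),
            "b": float(m.group(3))}

def verify(llm_output: str, spec: Spec) -> list[str]:
    """Return the names of all violated constraints (empty = passed)."""
    facts = parse_numeric_answer(llm_output)
    if not facts:
        return ["unparseable-output"]
    return [name for name, pred in spec.items() if not pred(facts)]

# Domain constraint: the claimed relation must match real arithmetic.
numeric_spec: Spec = {
    "relation-consistent": lambda f:
        (f["a"] > f["b"]) == (f["rel"] == "greater"),
}

print(verify("9.11 is greater than 9.9", numeric_spec))  # flags the error
print(verify("9.9 is greater than 9.11", numeric_spec))  # passes
```

The point of the declarative split is that the checker stays generic while the spec carries all domain knowledge, which is what makes per-domain adaptation cheap.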

📝 Abstract
Large language models (LLMs) have emerged as a dominant AI paradigm due to their exceptional text understanding and generation capabilities. However, their tendency to generate inconsistent or erroneous outputs challenges their reliability, especially in high-stakes domains requiring accuracy and trustworthiness. Existing research primarily focuses on detecting and mitigating model misbehavior in general-purpose scenarios, often overlooking the potential of integrating domain-specific knowledge. In this work, we advance misbehavior detection by incorporating domain knowledge. The core idea is to design a general specification language that enables domain experts to customize domain-specific predicates in a lightweight and intuitive manner, supporting later runtime verification of LLM outputs. To achieve this, we design a novel specification language, ESL, and introduce a runtime verification framework, RvLLM, to validate LLM output against domain-specific constraints defined in ESL. We evaluate RvLLM on three representative tasks: violation detection against Singapore Rapid Transit Systems Act, numerical comparison, and inequality solving. Experimental results demonstrate that RvLLM effectively detects erroneous outputs across various LLMs in a lightweight and flexible manner. The results reveal that despite their impressive capabilities, LLMs remain prone to low-level errors due to limited interpretability and a lack of formal guarantees during inference, and our framework offers a potential long-term solution by leveraging expert domain knowledge to rigorously and efficiently verify LLM outputs.
Problem

Research questions and friction points this paper is trying to address.

Detecting inconsistent or erroneous outputs in LLMs
Integrating domain-specific knowledge for misbehavior detection
Validating LLM outputs against domain-specific constraints
Innovation

Methods, ideas, or system contributions that make the work stand out.

Domain-specific predicates for LLM verification
Lightweight ESL specification language design
RvLLM runtime framework for validating LLM outputs
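A domain-specific predicate of the kind the paper's transit-regulation task needs might look like the following sketch; the rule, field names, and values are invented for illustration, since the page does not show ESL itself:

```python
# Hypothetical regulation-style predicate, loosely analogous to the
# Singapore Rapid Transit Systems Act task. The rule content and the
# answer fields are invented for illustration only.
def no_smoking_rule(answer: dict) -> bool:
    """If the scenario is smoking on transit premises, the LLM's
    verdict must be 'violation'; otherwise the rule does not apply."""
    if answer.get("act") == "smoking" and answer.get("place") == "station":
        return answer.get("verdict") == "violation"
    return True

# A model verdict that smoking in a station is permitted gets flagged:
print(no_smoking_rule({"act": "smoking", "place": "station",
                       "verdict": "permitted"}))  # False -> flagged
```

Encoding such rules as checkable predicates, rather than prompt text, is what gives the verification step a formal pass/fail outcome.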