When Models Ignore Definitions: Measuring Semantic Override Hallucinations in LLM Reasoning

📅 2026-02-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses a critical limitation of large language models (LLMs) in formal reasoning: their tendency to exhibit “semantic override” hallucinations when confronted with locally redefined semantics—such as custom logic gates or operators—due to overreliance on pretraining priors, thereby disregarding temporary definitions provided in prompts. The study formally defines and quantifies two error types, semantic override and assumption injection, and introduces a micro-benchmark comprising 30 logical and digital circuit reasoning tasks. Using validator-style trap tasks that encompass Boolean algebra and operator overloading, the authors evaluate model adherence to local semantic specifications. Experimental results reveal that mainstream LLMs consistently ignore contextual definitions, introduce undeclared assumptions, and omit critical constraints—even in simple tasks—highlighting fundamental shortcomings in their capacity for rigorous formal reasoning.

📝 Abstract
Large language models (LLMs) demonstrate strong performance on standard digital logic and Boolean reasoning tasks, yet their reliability under locally redefined semantics remains poorly understood. In many formal settings, such as circuit specifications, examinations, and hardware documentation, operators and components are explicitly redefined within narrow scope. Correct reasoning in these contexts requires models to temporarily suppress globally learned conventions in favor of prompt-local definitions. In this work, we study a systematic failure mode we term semantic override, in which an LLM reverts to its pretrained default interpretation of operators or gate behavior despite explicit redefinition in the prompt. We also identify a related class of errors, assumption injection, where models commit to unstated hardware semantics when critical details are underspecified, rather than requesting clarification. We introduce a compact micro-benchmark of 30 logic and digital-circuit reasoning tasks designed as verifier-style traps, spanning Boolean algebra, operator overloading, redefined gates, and circuit-level semantics. Evaluating three frontier LLMs, we observe persistent noncompliance with local specifications, confident but incompatible assumptions, and dropped constraints even in elementary settings. Our findings highlight a gap between surface-level correctness and specification-faithful reasoning, motivating evaluation protocols that explicitly test local unlearning and semantic compliance in formal domains.
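The abstract describes verifier-style trap tasks in which an operator is locally redefined and a model's answer is checked against the prompt-local semantics rather than the conventional ones. As a rough illustration of that grading idea (a hypothetical sketch, not the paper's actual benchmark or code; the redefinition and function names are invented for this example):

```python
# Hypothetical trap task: the prompt locally redefines AND to behave as NOR.
# A specification-faithful model must answer using the local definition,
# not the globally learned convention.

def and_default(a: int, b: int) -> int:
    """Pretrained/global convention: ordinary Boolean AND."""
    return a & b

def and_redefined(a: int, b: int) -> int:
    """Prompt-local redefinition: 'AND' acts as NOR in this task."""
    return 1 - (a | b)

def grade(model_answer: int, a: int, b: int) -> str:
    """Classify a model's answer to 'AND(a, b) = ?' under the local rule."""
    if model_answer == and_redefined(a, b):
        return "compliant"          # followed the local definition
    if model_answer == and_default(a, b):
        return "semantic override"  # reverted to pretrained semantics
    return "other error"

# For inputs (0, 0) the local definition gives 1 while the default gives 0,
# so the two failure modes are cleanly separable.
print(grade(1, 0, 0))  # compliant
print(grade(0, 0, 0))  # semantic override
```

The key design property of such a trap is that the locally correct answer and the pretrained-default answer disagree on the probed inputs, so reverting to global semantics is directly observable rather than masked by coincidental agreement.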
Problem

Research questions and friction points this paper is trying to address.

semantic override
hallucination
local redefinition
LLM reasoning
specification compliance
Innovation

Methods, ideas, or system contributions that make the work stand out.

semantic override
assumption injection
local semantic compliance
formal reasoning
micro-benchmark
Yogeswar Reddy Thota
Dept. of Electrical and Computer Engineering, University of Texas at Dallas, Richardson, TX, USA
Setareh Rafatirad
Associate Professor, Computer Science Department, University of California Davis
Mobile Security, Edge Device Trust, Applied Machine Learning, Cybersecurity, HW/SW Co-Design
Houman Homayoun
Dept. of Electrical and Computer Engineering, University of California Davis, CA, USA
Tooraj Nikoubin
Dept. of Electrical and Computer Engineering, University of Texas at Dallas, Richardson, TX, USA