🤖 AI Summary
This study investigates how architectural inductive biases—specifically between Transformer (self-attention-based) and recurrent architectures—affect hallucination generation mechanisms in large language models (LLMs).
Method: We design a controlled experimental framework that integrates multi-dimensional prompt engineering, targeted hallucination injection, and cross-architectural benchmarking to systematically compare hallucinations by position, semantic type, and ease of induction.
Contribution/Results: For the first time, we bridge hallucination analysis with non-Transformer architectural evolution—including attention-free alternatives—and demonstrate that architecture choice fundamentally reshapes hallucination distribution patterns and triggering mechanisms. While hallucinations remain pervasive, they exhibit statistically significant positional preferences, semantic type biases, and differential robustness across architectures. These findings provide critical empirical evidence and theoretical grounding for developing architecture-aware, generalizable hallucination mitigation strategies.
📝 Abstract
The growing prominence of large language models (LLMs) in everyday life is largely attributable to their generative abilities, yet some of the attention they receive also stems from the risks and costs associated with their use. On one front is their tendency to hallucinate false or misleading information, which limits their reliability. On another is the increasing focus on the computational limitations of traditional self-attention-based LLMs, which has spurred new alternatives, in particular recurrent models, designed to overcome them. Yet these two concerns are rarely considered together. Do changes in architecture exacerbate or alleviate existing concerns about hallucinations? Do they affect how and where hallucinations occur? Through an extensive evaluation, we study how these architecture-based inductive biases affect the propensity to hallucinate. While hallucination remains a general phenomenon not limited to specific architectures, the situations in which hallucinations occur and the ease with which specific types can be induced can differ significantly across model architectures. These findings highlight the need to better understand these two problems in conjunction with each other, and to design more universal techniques for handling hallucinations.