Predictable Confabulations: Factual Recall by LLMs Scales with Model Size and Topic Frequency

📅 2026-05-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses a gap in scaling-law research: how factual recall in large language models depends on model scale and on how well a topic is represented in the training data. Evaluating 38 models on over 8,900 scholarly references checked by an automated reference verification system, the authors find that recall quality follows a sigmoid in a log-linear combination of model parameter count and topic frequency, which they describe as the first scaling law linking factual recall to both model size and training-data composition. These two variables explain 60% of the variance across 16 dense models from four families and 74–94% within individual families. To account for the pattern, the authors propose a superposition-inspired mechanism in which recall is gated by a signal-to-noise ratio: signal strength scales with concept frequency and the noise floor with model capacity. The study combines automated citation verification, large-scale evaluation, and statistical modeling to make factual recall predictable from these two variables.
📝 Abstract
While scaling laws govern aggregate large language model performance, no scaling law has linked factual recall to both model size and training-data composition. We evaluated 38 models on over 8,900 scholarly references checked by an automated reference verification system. Recall quality follows a sigmoid in the log-linear combination of model parameter count and topic representation in training data. These two variables alone explain 60% of the variance across 16 dense models from four families, rising to 74–94% within individual families. The form matches a superposition-inspired account in which recall is gated by a signal-to-noise ratio: signal strength scales with concept frequency and the noise floor with model capacity.
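The functional form described in the abstract, a sigmoid applied to a log-linear combination of parameter count and topic frequency, can be illustrated with a minimal sketch. The function name `recall_quality`, the coefficients `a`, `b`, `c`, the use of base-10 logarithms, and the unit of `topic_freq` are all assumptions made for illustration; the abstract does not state fitted values or these details.

```python
import math


def recall_quality(n_params, topic_freq, a, b, c):
    """Sigmoid of a log-linear combination of model size and topic frequency.

    All arguments after topic_freq are hypothetical coefficients; they are
    placeholders, not values reported by the paper.
    """
    # Log-linear combination of the two predictors (log base is an assumption).
    z = a * math.log10(n_params) + b * math.log10(topic_freq) + c
    # Logistic sigmoid maps the combined score to a value in (0, 1).
    return 1.0 / (1.0 + math.exp(-z))


# Placeholder coefficients chosen only so the midpoint is easy to see.
A, B, C = 1.0, 1.0, -12.0

small = recall_quality(1e9, 1e3, A, B, C)   # z = 9 + 3 - 12 = 0
large = recall_quality(1e10, 1e3, A, B, C)  # larger model, same topic
rare = recall_quality(1e9, 1e2, A, B, C)    # same model, rarer topic

print(small, large, rare)
```

Under this form, increasing either the parameter count or the topic frequency moves the score along the same sigmoid, so a rarer topic can in principle be offset by a larger model, with the trade-off set by the ratio of the two coefficients.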
Problem

Research questions and friction points this paper is trying to address.

factual recall
scaling laws
model size
training data composition
large language models
Innovation

Methods, ideas, or system contributions that make the work stand out.

scaling laws
factual recall
model size
topic frequency
signal-to-noise ratio