Pretraining Exposure Explains Popularity Judgments in Large Language Models

📅 2026-05-12

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study asks whether large language models’ preference for well-known entities reflects real-world popularity or statistical exposure in pretraining data. Using the open OLMo models and their fully observable 7.4-trillion-token Dolma pretraining corpus, the authors compute entity-level exposure statistics for 2,000 entities across five types (Person, Location, Organization, Art, Product) and compare them against Wikipedia pageviews and the models’ own popularity judgments, elicited through direct scalar estimation and pairwise comparison. Exposure correlates strongly with Wikipedia popularity, yet model judgments align more closely with exposure than with Wikipedia, especially under pairwise comparison. The alignment is strongest for larger models and persists in the long tail, where Wikipedia popularity becomes unreliable, indicating that popularity priors are shaped primarily by pretraining statistics rather than external popularity signals.

📝 Abstract

Large language models (LLMs) exhibit systematic preferences for well-known entities, a phenomenon often attributed to popularity bias. However, the extent to which these preferences reflect real-world popularity versus statistical exposure during pretraining remains unclear, largely due to the inaccessibility of most training corpora. We provide the first direct, large-scale analysis of popularity bias grounded in fully observable pretraining data. Leveraging the open OLMo models and their complete pretraining corpus, Dolma, we compute precise entity-level exposure statistics across 7.4 trillion tokens. We analyze 2,000 entities spanning five types (Person, Location, Organization, Art, Product) and compare pretraining exposure against Wikipedia pageviews and two elicited LLM popularity signals: direct scalar estimation and pairwise comparison. Our results show that pretraining exposure strongly correlates with Wikipedia popularity, validating exposure as a meaningful proxy for real-world salience during the training period. More importantly, we find that LLM popularity judgments align more closely with exposure than with Wikipedia, especially when elicited via pairwise comparisons. This alignment is strongest for larger models and persists in the long tail, where Wikipedia popularity becomes unreliable. Overall, our findings demonstrate that popularity priors in LLMs are primarily shaped by pretraining statistics rather than external popularity signals, offering concrete evidence that data exposure plays a central role in driving popularity bias.
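The comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not say which statistics the authors use, so Spearman rank correlation and a pairwise agreement rate are assumptions here, as are all function names. The five-entity lists are made-up placeholder values chosen only to exercise the code; they are not results from the paper.

```python
import math
from itertools import combinations


def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def pairwise_agreement(reference, judged):
    """Fraction of entity pairs that both signals order the same way.

    Pairs tied in the reference signal are skipped.
    """
    agree = total = 0
    for i, j in combinations(range(len(reference)), 2):
        if reference[i] == reference[j]:
            continue
        total += 1
        if (reference[i] > reference[j]) == (judged[i] > judged[j]):
            agree += 1
    return agree / total


# Placeholder values for five hypothetical entities (NOT data from the paper).
exposure = [1000, 500, 100, 50, 10]   # assumed corpus mention counts
pageviews = [900, 100, 400, 60, 5]    # assumed Wikipedia pageview counts
model_score = [95, 80, 40, 30, 10]    # assumed scalar popularity estimates

rho_exposure = spearman(exposure, model_score)
rho_pageviews = spearman(pageviews, model_score)
agree_exposure = pairwise_agreement(exposure, model_score)
agree_pageviews = pairwise_agreement(pageviews, model_score)

print("Spearman vs exposure: ", rho_exposure)
print("Spearman vs pageviews:", rho_pageviews)
print("Pairwise agreement vs exposure: ", agree_exposure)
print("Pairwise agreement vs pageviews:", agree_pageviews)
```

In a setup like this, "aligns more closely with exposure than with Wikipedia" would show up as a higher correlation or agreement rate against the exposure column than against the pageview column. Rank-based measures are a natural fit because both mention counts and pageviews tend to be heavily skewed.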

Problem

Research questions and friction points this paper is trying to address.

popularity bias

pretraining exposure

large language models

entity popularity

data exposure

Innovation

Methods, ideas, or system contributions that make the work stand out.

pretraining exposure

popularity bias

large language models