🤖 AI Summary
This study addresses the underexplored issue of *agentic editorial bias*—systematic, implicit information curation—when large language models (LLMs) function as news gatekeepers. Method: We conduct the first systematic audit of three state-of-the-art LLMs (GPT-4o-Mini, Claude-3.7-Sonnet, Gemini-2.0-Flash) against Google News, employing a multi-layered algorithmic framework that integrates topic-based querying, media outlet classification, ideological positioning, and factual accuracy assessment, validated across diverse prompting strategies and reliability benchmarks. Results: All LLMs exhibit statistically significant, robust ideological skew and uneven attention allocation: they amplify ideologically aligned outlets while suppressing others, yielding lower media diversity and narrower exposure sets than conventional news aggregators. Crucially, the models differ markedly in the direction of their bias. We introduce the concept of *agentic editorial policy* to formalize LLMs’ latent, systemic filtering mechanisms—revealing their emergent role as high-stakes news intermediaries with substantial potential for information manipulation. This work provides foundational empirical evidence and a theoretical framework for LLM content governance.
📝 Abstract
Large Language Models (LLMs) increasingly act as gateways to web content, shaping how millions of users encounter online information. Unlike traditional search engines, whose retrieval and ranking mechanisms are well studied, the selection processes of web-connected LLMs add layers of opacity to how answers are generated. By determining which news outlets users see, these systems can influence public opinion, reinforce echo chambers, and pose risks to civic discourse and public trust.
This work extends two decades of research in algorithmic auditing to examine how LLMs function as news engines. We present the first audit comparing three leading agents, GPT-4o-Mini, Claude-3.7-Sonnet, and Gemini-2.0-Flash, against Google News, asking: *How do LLMs differ from traditional aggregators in the diversity, ideology, and reliability of the media they expose to users?*
Across 24 global topics, we find that, compared to Google News, LLMs surface significantly fewer unique outlets and allocate attention more unevenly. The models also diverge from one another: GPT-4o-Mini emphasizes more factual and right-leaning sources; Claude-3.7-Sonnet favors institutional and civil-society domains and slightly amplifies right-leaning exposure; and Gemini-2.0-Flash exhibits a modest left-leaning tilt without significant changes in factuality. These patterns remain robust under prompt variations and alternative reliability benchmarks. Together, our findings show that LLMs already enact *agentic editorial policies*, curating information in ways that diverge from conventional aggregators. Understanding and governing their emerging editorial power will be critical for ensuring transparency, pluralism, and trust in digital information ecosystems.