🤖 AI Summary
This study investigates how large language models (LLMs) resolve knowledge conflicts between institutional sources—such as government or news outlets—and individual or social media sources, with a focus on the influence of source credibility and repeated information. Drawing on interdisciplinary research on credibility, we develop a systematic evaluation framework to analyze the preference behaviors of 13 open-weight LLMs. Our work reveals, for the first time, that while models generally favor institutional sources, this preference can be substantially reversed by repeated exposure to conflicting information from less credible sources. Building on this insight, we propose a novel mitigation method that reduces repetition-induced bias by up to 99.8% while preserving at least 88.8% of the models' original source preferences.
📝 Abstract
As large language models (LLMs) are more frequently used in retrieval-augmented generation pipelines, it is increasingly relevant to study their behavior under knowledge conflicts. Thus far, the role of the source of the retrieved information has gone unexamined. We address this gap with a novel framework, motivated by interdisciplinary research on credibility, for investigating how source preferences affect LLM resolution of inter-context knowledge conflicts in English. Through a comprehensive, tightly controlled evaluation of 13 open-weight LLMs, we find that LLMs prefer institutionally corroborated information (e.g., from government or newspaper sources) over information from individuals and social media. However, these source preferences can be reversed by simply repeating information from less credible sources. To mitigate repetition effects and maintain consistent preferences, we propose a novel method that reduces repetition bias by up to 99.8% while preserving at least 88.8% of original preferences. We release all data and code to encourage future work on credibility and source preferences in knowledge-intensive NLP.