MillStone: How Open-Minded Are LLMs?

📅 2025-09-15
📈 Citations: 0
Influential: 0
🤖 AI Summary
Large language models (LLMs) are susceptible to external information sources when forming stances on controversial topics, raising concerns about stance manipulation and reliability. Method: We introduce MillStone, the first dedicated benchmark for evaluating the stance openness of LLMs when opposing-stance documents are injected via web search and retrieval tools. Controlled experiments quantify response consistency, persuasiveness bias, and the magnitude of stance shift across nine mainstream LLMs. Contribution/Results: (1) We propose and operationalize the first measurable benchmark of LLM stance openness; (2) we show that authoritative external sources exert strong directional influence on model outputs, exposing a significant vulnerability to adversarial manipulation; (3) we identify substantial inter-model heterogeneity in stance responses to identical evidence. Results show that most LLMs are open-minded on most issues, yet their stance stability remains highly contingent on the authority and slant of input sources.
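The evaluation the summary describes can be pictured with a minimal sketch: ask a model for its stance on an issue, ask again with an opposing-stance document injected into the prompt, and count stance flips across issues. This is a hypothetical illustration, not the authors' benchmark code; `stance`, `openness_score`, and the stub model are made-up names for exposition.

```python
# Hypothetical sketch of a MillStone-style stance-openness probe.
# `model` is any callable mapping a prompt string to a 'pro'/'con' answer.

def stance(model, issue, evidence=None):
    """Return the model's stance ('pro' or 'con'), optionally after
    injecting an external document into the prompt."""
    prompt = f"Take a stance (pro/con) on: {issue}"
    if evidence:
        prompt += f"\nConsider this source first:\n{evidence}"
    return model(prompt)

def openness_score(model, issues, opposing_docs):
    """Fraction of issues on which injected opposing evidence flips
    the model's initial stance -- a crude 'openness' measure."""
    flips = 0
    for issue in issues:
        before = stance(model, issue)
        after = stance(model, issue, evidence=opposing_docs[issue])
        flips += (before != after)
    return flips / len(issues)

# Stub model for illustration: answers 'pro' unless the injected
# evidence explicitly argues 'con'.
def stub_model(prompt):
    return "con" if "argues con" in prompt else "pro"

issues = ["issue A", "issue B"]
docs = {
    "issue A": "An authoritative source argues con.",
    "issue B": "A neutral background note.",
}
print(openness_score(stub_model, issues, docs))  # 0.5: one of two stances flipped
```

A real harness would of course replace the stub with API calls to the nine evaluated models and use a classifier rather than string matching to extract the stance from free-form answers.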

📝 Abstract
Large language models equipped with Web search, information retrieval tools, and other agentic capabilities are beginning to supplant traditional search engines. As users start to rely on LLMs for information on many topics, including controversial and debatable issues, it is important to understand how the stances and opinions expressed in LLM outputs are influenced by the documents they use as their information sources. In this paper, we present MillStone, the first benchmark that aims to systematically measure the effect of external arguments on the stances that LLMs take on controversial issues (not all of them political). We apply MillStone to nine leading LLMs and measure how "open-minded" they are to arguments supporting opposite sides of these issues, whether different LLMs agree with each other, which arguments LLMs find most persuasive, and whether these arguments are the same for different LLMs. In general, we find that LLMs are open-minded on most issues. An authoritative source of information can easily sway an LLM's stance, highlighting the importance of source selection and the risk that LLM-based information retrieval and search systems can be manipulated.
Problem

Research questions and friction points this paper is trying to address.

Evaluating how external arguments influence LLM stances on controversial issues
Measuring LLM open-mindedness to opposing arguments across nine models
Assessing persuasiveness and manipulation risks in LLM information retrieval
Innovation

Methods, ideas, or system contributions that make the work stand out.

First benchmark to measure the effect of external arguments on LLM stances
Tests the open-mindedness of nine LLMs on controversial issues
Evaluates which arguments and authoritative sources LLMs find most persuasive