Better Aligned with Survey Respondents or Training Data? Unveiling Political Leanings of LLMs on U.S. Supreme Court Cases

📅 2025-02-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study investigates the origin of political bias in large language models (LLMs) regarding 32 contentious U.S. Supreme Court cases (e.g., abortion, voting rights): whether it stems from inherent biases in training corpora or reflects human-elicited public consensus. We propose a corpus-level quantification method for political bias, integrating case-specific political polarity annotations, statistical modeling of corpus-level political distributions, cross-source correlation analysis, and systematic comparison with multiple nationally representative public opinion surveys. Our key empirical finding—first of its kind—is that LLMs’ political judgments exhibit a statistically significant positive correlation with the political distribution of their training corpora (p < 0.01), yet show no statistically significant association with survey-based public opinion. These results indicate that LLMs’ political leanings are predominantly endogenous to pretraining data rather than aligned with human value consensus, underscoring the foundational role of data governance in value alignment.
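The cross-source correlation analysis described above can be illustrated with a minimal sketch: for each case, compare the model's stance score against (a) the political polarity of its training corpus and (b) survey-based public opinion, then test each pairing for a significant correlation. All scores and case counts below are illustrative placeholders, not the paper's data.

```python
from scipy.stats import pearsonr

# Hypothetical per-case scores in [-1, 1]
# (negative = liberal-leaning, positive = conservative-leaning).
corpus_polarity = [-0.6, -0.2, 0.4, -0.8, 0.1, 0.5, -0.3, 0.7]
llm_stance      = [-0.5, -0.1, 0.3, -0.7, 0.2, 0.4, -0.2, 0.6]
survey_opinion  = [ 0.2, -0.4, 0.1,  0.3, -0.5, 0.0, 0.4, -0.1]

# Correlate the model's judgments with each candidate source.
r_corpus, p_corpus = pearsonr(llm_stance, corpus_polarity)
r_survey, p_survey = pearsonr(llm_stance, survey_opinion)

print(f"LLM vs. corpus: r={r_corpus:.2f}, p={p_corpus:.4f}")
print(f"LLM vs. survey: r={r_survey:.2f}, p={p_survey:.4f}")
```

With data like this, the corpus pairing yields a strong, significant positive correlation while the survey pairing does not, mirroring the pattern of results the study reports.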

📝 Abstract
The increased adoption of Large Language Models (LLMs) and their potential to shape public opinion have sparked interest in assessing these models' political leanings. Building on previous research that compared LLMs and human opinions and observed political bias in system responses, we take a step further to investigate the underlying causes of such biases by empirically examining how the values and biases embedded in training corpora shape model outputs. Specifically, we propose a method to quantitatively evaluate political leanings embedded in large pretraining corpora. Subsequently, we investigate with whom the LLMs' political leanings are more aligned: their pretraining corpora or the surveyed human opinions. As a case study, we focus on probing the political leanings of LLMs in 32 U.S. Supreme Court cases, addressing contentious topics such as abortion and voting rights. Our findings reveal that LLMs strongly reflect the political leanings in their training data, while no strong correlation is observed with human opinions as expressed in surveys. These results underscore the importance of responsible curation of training data and the need for robust evaluation metrics to ensure LLMs' alignment with human-centered values.
Problem

Research questions and friction points this paper is trying to address.

Assess LLMs' political leanings
Compare alignment with training data vs. human opinions
Examine biases in U.S. Supreme Court cases
Innovation

Methods, ideas, or system contributions that make the work stand out.

Quantitative evaluation of political leanings
Comparative alignment with training corpora
Case study on U.S. Supreme Court cases
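The quantitative evaluation of corpus-level political leanings listed above can be sketched as follows. This is an assumed aggregation scheme, not the paper's exact pipeline: documents discussing a case each receive a stance label, and the labels are reduced to a single polarity score per case.

```python
from collections import Counter

def corpus_polarity(doc_labels):
    """Aggregate per-document stance labels (-1 liberal, 0 neutral,
    +1 conservative) into a corpus-level polarity score in [-1, 1]."""
    counts = Counter(doc_labels)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # no documents found for this case
    return (counts[1] - counts[-1]) / total

# Hypothetical labels for documents retrieved for one case.
labels = [-1, -1, 0, 1, -1, 0, -1]
print(round(corpus_polarity(labels), 2))  # -0.43
```

Scoring each of the 32 cases this way yields a per-case corpus distribution that can then be correlated with the model's own judgments.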