🤖 AI Summary
This study addresses implicit political bias in large language models (LLMs) and the way it exacerbates informational inequity in responses to socio-political issues. We construct a multidimensional benchmark of politically sensitive questions spanning both polarized and non-polarized topics, and propose a controllable evaluation framework grounded in prompt engineering that integrates political stance annotation, response consistency measurement, cross-model response clustering, and statistical significance testing. Our analysis systematically reveals, for the first time, empirical correlations between LLM political leanings and model scale, release date, and geographic origin. LLMs exhibit significant leftward bias on highly polarized issues, with bias intensity increasing monotonically with parameter count; on non-polarized issues, by contrast, they demonstrate high response consistency. These findings provide a reproducible methodological foundation and empirical evidence for fairness assessment and governance of LLMs.
📝 Abstract
Large Language Models (LLMs) are widely used to generate responses on social topics because of their world knowledge and generative capabilities. Beyond reasoning and generation performance, political bias is an essential issue that warrants attention. Political bias, a universal phenomenon in human society, may be transferred to LLMs and distort how they acquire information and convey it to humans, leading to unequal access to information among different groups of people. To prevent LLMs from reproducing and reinforcing political biases, and to encourage fairer LLM-human interactions, a comprehensive examination of political bias in popular LLMs is urgent. In this study, we systematically measure political bias in a wide range of LLMs, using a curated set of questions that probe political bias in various contexts. Our findings reveal distinct patterns in how LLMs respond to political topics. For highly polarized topics, most LLMs exhibit a pronounced left-leaning bias. Conversely, less polarized topics elicit greater consensus, with similar response patterns across different LLMs. Additionally, we analyze how LLM characteristics, including release date, model scale, and region of origin, affect political bias. The results indicate that political biases evolve with model scale and release date and are also influenced by the LLMs' region of origin.
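As a rough illustration of how a per-model leaning and cross-model consensus might be quantified, the sketch below assumes each response has already been annotated with a stance score in [-1, 1] (negative for left-leaning, positive for right-leaning). The score scale, the model names, the example values, and the choice of an exact sign test are all assumptions made for illustration; the study's actual annotation scheme and statistical tests are not specified here.

```python
from itertools import combinations
from math import comb
from statistics import mean


def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def sign_test_p(scores):
    """Two-sided exact sign test against a null of no lean (median 0)."""
    neg = sum(1 for s in scores if s < 0)
    pos = sum(1 for s in scores if s > 0)
    n = neg + pos  # exact zeros (ties) are dropped
    if n == 0:
        return 1.0
    k = min(neg, pos)
    # Probability of a split at least this lopsided under Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


def pairwise_agreement(scores_by_model):
    """Share of (model pair, question) cases where both models lean the same way."""
    hits = []
    for a, b in combinations(scores_by_model.values(), 2):
        hits.extend(1.0 if sign(x) == sign(y) else 0.0 for x, y in zip(a, b))
    return mean(hits) if hits else 1.0


if __name__ == "__main__":
    # Hypothetical stance scores for two hypothetical models on eight questions.
    scores = {
        "model_a": [-0.6, -0.4, -0.8, -0.2, -0.5, -0.7, -0.3, -0.9],
        "model_b": [-0.5, -0.1, -0.6, 0.2, -0.4, -0.3, -0.2, -0.7],
    }
    for name, vals in scores.items():
        print(name, "mean stance:", round(mean(vals), 3), "p:", sign_test_p(vals))
    print("pairwise agreement:", pairwise_agreement(scores))
```

A sign test is used here only because it needs no distributional assumptions and fits in the standard library. Running the same functions separately on polarized and non-polarized question sets would be one way to compare leaning and consensus across the two groups.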