🤖 AI Summary
This work systematically evaluates geopolitical narrative biases in large language models (LLMs) across pivotal historical events involving the U.S., U.K., USSR, and China. To address the lack of standardized benchmarks for cross-national adversarial perspectives, we introduce the first geographically diverse, multi-perspective benchmark dataset comprising neutral event descriptions and conflicting national narratives. We further propose an attribution-manipulation–based sensitivity analysis framework that quantifies model preferences via multi-perspective prompting and label perturbation. Experimental results reveal significant, asymmetric national preference biases in LLMs; high sensitivity to attribution-label tampering; and weak consistency in identifying biased narratives. Moreover, conventional prompt-engineering debiasing techniques prove largely ineffective. Our study establishes the first measurable, reproducible evaluation protocol for geopolitical bias in LLMs. The open-sourced dataset and evaluation framework provide foundational infrastructure for future fairness and alignment research.
📝 Abstract
This paper evaluates geopolitical biases in LLMs with respect to various countries through an analysis of their interpretations of historical events with conflicting national perspectives (USA, UK, USSR, and China). We introduce a novel dataset containing neutral event descriptions and contrasting viewpoints from different countries. Our findings show significant geopolitical biases, with models favoring specific national narratives. Additionally, simple debiasing prompts have only a limited effect in reducing these biases. Experiments with manipulated participant labels reveal the models' sensitivity to attribution: swapped labels sometimes amplify biases and sometimes lead models to recognize the inconsistency. This work highlights national narrative biases in LLMs, challenges the effectiveness of simple debiasing methods, and offers a framework and dataset for future research on geopolitical bias.
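The attribution-swap perturbation described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `swap_attribution` and the example event text are hypothetical, and the model query itself is omitted.

```python
def swap_attribution(text: str, country_a: str, country_b: str) -> str:
    """Swap two national attribution labels in an event description.

    A placeholder token keeps the two replacements from clobbering
    each other (replacing A->B then B->A would merge both into A).
    """
    placeholder = "\x00"
    return (
        text.replace(country_a, placeholder)
            .replace(country_b, country_a)
            .replace(placeholder, country_b)
    )


# Illustrative event description (not from the paper's dataset).
event = ("Narrative 1: the USA initiated the blockade. "
         "Narrative 2: the USSR responded to prior provocations.")

perturbed = swap_attribution(event, "USA", "USSR")

# Both the original and perturbed descriptions would then be sent to the
# model with the same question; a verdict that flips with the labels
# (rather than with the described actions) signals attribution sensitivity.
```

Because the swap is an involution, applying it twice recovers the original text, which gives a cheap sanity check that the perturbation itself introduced no other changes.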