Assessing Historical Structural Oppression Worldwide via Rule-Guided Prompting of Large Language Models

📅 2025-09-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing measures of historical structural oppression suffer from nation-specificity, material-resource bias, and insufficient attention to identity-based social exclusion—hindering cross-national comparability. This paper introduces a rule-guided, large language model (LLM) framework that generates context-sensitive, interpretable identity-based historical disadvantage scores from multilingual self-reported ethnic narratives. Our approach integrates theory-driven prompt engineering, cross-cultural semantic modeling, and systematic evaluation to quantify historical oppression embedded in unstructured textual accounts. Empirical validation demonstrates robust cross-national pattern detection, and the framework substantially improves upon conventional metrics in empirical grounding, cultural adaptability, and interpretability. We publicly release a benchmark dataset and analytical toolkit, establishing the first reproducible, scalable paradigm for measuring systemic exclusion—advancing research in public health and social inequality.

📝 Abstract
Traditional efforts to measure historical structural oppression struggle with cross-national validity because each country's histories of exclusion, colonization, and social status are unique and locally specified, and such efforts have often relied on structured indices that privilege material resources while overlooking lived, identity-based exclusion. We introduce a novel framework for oppression measurement that leverages Large Language Models (LLMs) to generate context-sensitive scores of lived historical disadvantage across diverse geopolitical settings. Using unstructured self-identified ethnicity utterances from a multilingual COVID-19 global study, we design rule-guided prompting strategies that encourage models to produce interpretable, theoretically grounded estimates of oppression. We systematically evaluate these strategies across multiple state-of-the-art LLMs. Our results demonstrate that LLMs, when guided by explicit rules, can capture nuanced forms of identity-based historical oppression within nations. This approach provides a complementary measurement tool that highlights dimensions of systemic exclusion, offering a scalable, cross-cultural lens for understanding how oppression manifests in data-driven research and public health contexts. To support reproducible evaluation, we release an open-sourced benchmark dataset for assessing LLMs on oppression measurement (https://github.com/chattergpt/llm-oppression-benchmark).
Problem

Research questions and friction points this paper is trying to address.

Measuring historical structural oppression cross-nationally with validity
Overcoming limitations of material-focused indices via context-sensitive LLM scoring
Generating interpretable estimates of identity-based exclusion across diverse settings
Innovation

Methods, ideas, or system contributions that make the work stand out.

Rule-guided prompting of LLMs
Generation of context-sensitive oppression scores
Open-sourced benchmark dataset release
Sreejato Chatterjee
Goergen Institute for Data Science, University of Rochester
Linh Tran
Department of Computer Science, University of Rochester
Quoc Duy Nguyen
Department of Economics, University of Rochester
Roni Kirson
College of Arts and Sciences, University of Rochester
Drue Hamlin
Department of Sociology and Anthropology, Rochester Institute of Technology
Harvest Aquino
Department of Public Health, University of Rochester
Hanjia Lyu
University of Rochester
AI and Society · Multimodal LLMs · Graph Learning · Computational Social Science · Health Informatics
Jiebo Luo
Department of Computer Science, University of Rochester
Timothy Dye
Department of Obstetrics and Gynecology, University of Rochester School of Medicine