📝 Abstract
Traditional efforts to measure historical structural oppression struggle with cross-national validity because histories of exclusion, colonization, and social status are uniquely and locally specified in each country; they have also relied largely on structured indices that privilege material resources while overlooking lived, identity-based exclusion. We introduce a novel framework that leverages Large Language Models (LLMs) to generate context-sensitive scores of lived historical disadvantage across diverse geopolitical settings. Using unstructured self-identified ethnicity utterances from a multilingual global COVID-19 study, we design rule-guided prompting strategies that encourage models to produce interpretable, theoretically grounded estimates of oppression, and we systematically evaluate these strategies across multiple state-of-the-art LLMs. Our results demonstrate that LLMs, when guided by explicit rules, can capture nuanced forms of identity-based historical oppression within nations. This approach provides a complementary measurement tool that highlights dimensions of systemic exclusion, offering a scalable, cross-cultural lens on how oppression manifests in data-driven research and public health contexts. To support reproducible evaluation, we release an open-source benchmark dataset for assessing LLMs on oppression measurement (https://github.com/chattergpt/llm-oppression-benchmark).
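As a rough illustration of the rule-guided prompting idea, the sketch below assembles a scoring prompt from an explicit rule list and parses a structured model reply. The rule wording, the 0-4 scale, the example utterance, and all function names are illustrative assumptions, not the paper's actual prompts or rubric; the model reply is mocked rather than produced by an API call.

```python
import json

# Hypothetical rules; the paper's actual rule set is not reproduced here.
RULES = [
    "Score historical disadvantage on a 0-4 scale "
    "(0 = historically dominant group, 4 = severe historical oppression).",
    "Judge the group relative to its own national context, not a global baseline.",
    "Weigh identity-based exclusion (e.g., colonization, caste, indigeneity), "
    "not only material resources.",
    'Reply with JSON only: {"score": <int>, "rationale": <short string>}.',
]

def build_prompt(ethnicity_utterance: str, country: str) -> str:
    """Assemble a rule-guided scoring prompt for a single self-identified
    ethnicity utterance in its national context."""
    rule_block = "\n".join(f"{i + 1}. {r}" for i, r in enumerate(RULES))
    return (
        "You are rating historical structural oppression.\n"
        f"Rules:\n{rule_block}\n\n"
        f"Country: {country}\n"
        f'Self-identified ethnicity: "{ethnicity_utterance}"'
    )

def parse_response(raw: str) -> dict:
    """Parse the model's JSON reply and validate the score range."""
    obj = json.loads(raw)
    if not (isinstance(obj.get("score"), int) and 0 <= obj["score"] <= 4):
        raise ValueError("score missing or out of range")
    return obj

prompt = build_prompt("Quechua", "Peru")
# Mocked reply standing in for an actual LLM response:
mock_reply = '{"score": 3, "rationale": "Indigenous group with colonial-era dispossession."}'
result = parse_response(mock_reply)
```

Keeping the rules explicit in the prompt, and demanding a machine-parseable reply with a rationale, is what makes the resulting scores interpretable and auditable across models.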