Uncovering Hidden Violent Tendencies in LLMs: A Demographic Analysis via Behavioral Vignettes

📅 2025-06-25

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study investigates implicit preferences and demographic biases in large language models (LLMs) when they process violent content in morally ambiguous real-world scenarios. We adapt the Violent Behavior Vignette Questionnaire (VBVQ), a validated social science instrument, for LLM evaluation for the first time. Using standardized zero-shot and persona-based prompting, we run cross-model comparative experiments on six mainstream LLMs developed in different geopolitical contexts. Results reveal a discrepancy between surface-level outputs and latent violent inclinations. Moreover, the models' violent responses vary systematically with the race, age, and geographic identity specified in the prompt, in patterns that frequently contradict established findings in criminology, social science, and psychology, which points to structural biases. Our work contributes a methodologically grounded framework for LLM ethical assessment and a reproducible pipeline for bias detection in generative AI systems.

📝 Abstract

Large language models (LLMs) are increasingly proposed for detecting and responding to violent content online, yet their ability to reason about morally ambiguous, real-world scenarios remains underexamined. We present the first study to evaluate LLMs using a validated social science instrument designed to measure human responses to everyday conflict, namely the Violent Behavior Vignette Questionnaire (VBVQ). To assess potential bias, we introduce persona-based prompting that varies race, age, and geographic identity within the United States. Six LLMs developed across different geopolitical and organizational contexts are evaluated under a unified zero-shot setting. Our study reveals two key findings: (1) LLMs' surface-level text generation often diverges from their internal preference for violent responses; (2) their violent tendencies vary across demographics, frequently contradicting established findings in criminology, social science, and psychology.

Problem

Research questions and friction points this paper is trying to address.

Assessing LLMs' ability to handle morally ambiguous violent scenarios

Evaluating demographic biases in LLMs' violent response tendencies

Comparing LLM violence judgments against established social science findings

Innovation

Methods, ideas, or system contributions that make the work stand out.

Using VBVQ to evaluate LLM responses

Persona-based prompting to probe demographic bias

Zero-shot setting for unified LLM evaluation
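
As a rough illustration of how persona-based prompting could sit alongside a zero-shot baseline, the sketch below builds one plain prompt plus one prompt per combination of race, age, and region. Everything specific in it is an assumption made for illustration: the attribute values in `PERSONA_GRID`, the wording of the persona prefix, the `build_prompts` helper, and the placeholder vignette, which is not an actual VBVQ item. The paper's own prompt templates are not given in this summary.

```python
from itertools import product

# Hypothetical attribute values; the study's actual persona grid is not listed here.
PERSONA_GRID = {
    "age": ["25-year-old", "60-year-old"],
    "race": ["Black", "White"],
    "region": ["Midwestern", "Southern"],
}

# Placeholder scenario for illustration only; NOT an actual VBVQ item.
VIGNETTE = "Someone cuts in front of you in a long line. What do you do?"


def build_prompts(vignette, grid):
    """Return a zero-shot baseline prompt plus one prompt per persona combination."""
    # The baseline carries no persona, so it serves as the comparison point.
    prompts = [{"persona": None, "prompt": vignette}]
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        persona = dict(zip(keys, values))
        # Assumed prefix wording; a real study would fix and report its template.
        prefix = (
            f"You are a {persona['age']} {persona['race']} person "
            f"living in the {persona['region']} United States."
        )
        prompts.append({"persona": persona, "prompt": f"{prefix}\n{vignette}"})
    return prompts


prompts = build_prompts(VIGNETTE, PERSONA_GRID)
print(len(prompts))  # 1 baseline + 2 * 2 * 2 persona prompts = 9
```

Keeping the vignette text identical across all prompts means any difference in a model's responses can be attributed to the persona prefix rather than to the scenario wording.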

🔎 Similar Papers

Human and LLM Biases in Hate Speech Annotations: A Socio-Demographic Analysis of Annotators and Targets