Uncovering Hidden Violent Tendencies in LLMs: A Demographic Analysis via Behavioral Vignettes

📅 2025-06-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study investigates implicit preferences and demographic biases in large language models (LLMs) when they process violent content in morally ambiguous, real-world scenarios. We adapt the Violent Behavior Vignette Questionnaire (VBVQ), a validated social science instrument, for LLM evaluation for the first time. Using standardized zero-shot and persona-based prompting, we run comparative experiments across six LLMs developed in different geopolitical and organizational contexts. The results show that the models' surface-level outputs often diverge from their latent preference for violent responses. In addition, the violent responses vary systematically with the race, age, and geographic identity specified in the prompt, and these patterns frequently contradict established findings in criminology, social science, and psychology, which points to structural biases in the models. Our work contributes a methodologically grounded framework for LLM ethical assessment and a reproducible pipeline for bias detection in generative AI systems.

📝 Abstract
Large language models (LLMs) are increasingly proposed for detecting and responding to violent content online, yet their ability to reason about morally ambiguous, real-world scenarios remains underexamined. We present the first study to evaluate LLMs using a validated social science instrument designed to measure human responses to everyday conflict, namely the Violent Behavior Vignette Questionnaire (VBVQ). To assess potential bias, we introduce persona-based prompting that varies race, age, and geographic identity within the United States. Six LLMs developed across different geopolitical and organizational contexts are evaluated under a unified zero-shot setting. Our study reveals two key findings: (1) LLMs' surface-level text generation often diverges from their internal preference for violent responses; (2) their violent tendencies vary across demographics, frequently contradicting established findings in criminology, social science, and psychology.

Problem

Research questions and friction points this paper is trying to address.

Assessing LLMs' ability to handle morally ambiguous violent scenarios
Evaluating demographic biases in LLMs' violent response tendencies
Comparing LLM violence judgments against established social science findings

Innovation

Methods, ideas, or system contributions that make the work stand out.

Using VBVQ to evaluate LLM responses
Persona-based prompting for demographic bias
Zero-shot setting for unified LLM evaluation
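
The persona-based, zero-shot setup listed above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `build_persona_prompts`, the prompt template, and the attribute values in the usage note are all hypothetical placeholders, and the actual VBVQ vignettes and persona wording are not reproduced here. The sketch only shows the general idea of generating one prompt per combination of demographic attributes while holding the vignette text fixed.

```python
from itertools import product


def build_persona_prompts(vignette, attributes, template=None):
    """Return one zero-shot prompt per combination of persona attributes.

    `attributes` maps an attribute name (e.g. "race", "age", "region")
    to a list of values. Both the names and the default template are
    illustrative assumptions, not taken from the paper.
    """
    if template is None:
        template = (
            "Respond as a person with this profile: {persona}.\n\n{vignette}"
        )
    keys = list(attributes)
    prompts = []
    # Cartesian product: every combination of attribute values is one persona.
    for values in product(*(attributes[k] for k in keys)):
        persona = ", ".join(f"{k}: {v}" for k, v in zip(keys, values))
        prompts.append(
            {
                "persona": dict(zip(keys, values)),
                "prompt": template.format(persona=persona, vignette=vignette),
            }
        )
    return prompts
```

Calling `build_persona_prompts(vignette_text, {"race": [...], "age": [...], "region": [...]})` yields a list of dictionaries. Each one pairs a persona with the prompt to send to every model under the same zero-shot conditions, so that differences in responses can be attributed to the persona rather than to the vignette.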