A Japanese Benchmark for Evaluating Social Bias in Reasoning Based on Attribution Theory

📅 2026-04-01
📈 Citations: 0 · Influential: 0
🤖 AI Summary
This study addresses two limitations of existing Japanese bias-evaluation benchmarks: they rely predominantly on translations from English, and thus fail to capture culturally specific contexts, and they evaluate only biased conclusions rather than biases that emerge during reasoning. To bridge this gap, the authors bring attribution theory from social psychology into the evaluation of Japanese large language models and construct a new dataset, JUBAKU-v2. The dataset comprises 216 culturally grounded contrastive examples that fix the conclusion while varying whether the described behavior is performed by an in-group or an out-group member, thereby isolating attributional differences in the reasoning phase. Experimental results show that JUBAKU-v2 reveals performance disparities across models with respect to Japanese sociocultural biases more sensitively than existing benchmarks.
📝 Abstract
To improve the fairness of Large Language Models (LLMs), it is essential to evaluate social biases rooted in the cultural contexts of specific linguistic regions. However, most existing Japanese benchmarks rely heavily on data translated from English, which does not necessarily yield an evaluation suited to Japanese culture. Furthermore, they evaluate bias only in the conclusion, failing to capture biases lurking in the reasoning process. In this study, based on attribution theory from social psychology, we constructed a new dataset, "JUBAKU-v2," which evaluates bias in how behaviors are attributed to in-groups and out-groups during reasoning while the conclusion is held fixed. The dataset consists of 216 examples reflecting cultural biases specific to Japan. Experimental results verified that it detects performance differences across models more sensitively than existing benchmarks.
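To make the contrastive design concrete, the sketch below shows one way such an evaluation could be scored in Python. Everything here is an illustrative assumption rather than the paper's implementation: the item schema, the `model.generate` interface, and the keyword heuristic standing in for a real attribution classifier are all hypothetical, since the page does not describe the actual scoring protocol.

```python
# Hypothetical sketch of a JUBAKU-v2-style contrastive evaluation.
# Field names, the model interface, and the classifier are assumptions
# for illustration; the paper's actual schema and scoring may differ.
from dataclasses import dataclass

@dataclass
class ContrastiveItem:
    context_ingroup: str   # behavior described for an in-group member
    context_outgroup: str  # same behavior, attributed to an out-group member
    conclusion: str        # fixed conclusion shared by both variants

def classify_attribution(reasoning: str) -> str:
    """Toy heuristic: label a reasoning trace as 'internal' (disposition)
    or 'external' (circumstance). A real evaluator would likely use an
    LLM judge or a trained classifier instead of keyword matching."""
    internal_cues = ["性格", "能力", "人柄"]  # disposition cues: personality, ability, character
    return "internal" if any(cue in reasoning for cue in internal_cues) else "external"

def attribution_gap(model, items: list[ContrastiveItem]) -> float:
    """Fraction of items where the model attributes the same behavior
    differently depending only on group membership."""
    mismatches = 0
    for item in items:
        # "Explain the reason for the following behavior and state a conclusion."
        prompt = "次の行動の理由を説明し、結論を述べてください。\n"
        reasoning_in = model.generate(prompt + item.context_ingroup)
        reasoning_out = model.generate(prompt + item.context_outgroup)
        if classify_attribution(reasoning_in) != classify_attribution(reasoning_out):
            mismatches += 1  # reasoning diverged even though the conclusion is fixed
    return mismatches / len(items)
```

The design point mirrored here is that the two prompts differ only in group membership, so any divergence in the attribution style of the reasoning, with the conclusion held fixed, points to attributional bias rather than to a difference in the scenario itself.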
Problem

Research questions and friction points this paper is trying to address.

social bias · reasoning · attribution theory · Japanese benchmark · Large Language Models
Innovation

Methods, ideas, or system contributions that make the work stand out.

social bias · attribution theory · reasoning evaluation · Japanese benchmark · LLM fairness