A Japanese Benchmark for Evaluating Social Bias in Reasoning Based on Attribution Theory

📅 2026-04-01
📈 Citations: 0 · Influential: 0
🤖 AI Summary
This study addresses two limitations of existing Japanese bias-evaluation benchmarks: they rely predominantly on translations from English, and thus fail to capture culturally specific contexts, and they evaluate only biased conclusions rather than biases that emerge during reasoning. To bridge this gap, the authors bring attribution theory from social psychology into the evaluation of Japanese large language models and construct a new dataset, JUBAKU-v2. The dataset comprises 216 culturally grounded contrastive examples that fix the conclusion while varying whether the described behavior is performed by an in-group or an out-group member, thereby isolating attributional differences in the reasoning phase. Experimental results show that JUBAKU-v2 reveals performance disparities across models with respect to Japanese sociocultural biases more sensitively than existing benchmarks.
📝 Abstract
To improve the fairness of Large Language Models (LLMs), it is essential to evaluate social biases rooted in the cultural contexts of specific linguistic regions. However, most existing Japanese benchmarks rely heavily on data translated from English, which does not necessarily yield an evaluation suited to Japanese culture. Furthermore, they evaluate bias only in the conclusion, failing to capture biases lurking in the reasoning process. In this study, based on attribution theory from social psychology, we constructed a new dataset, "JUBAKU-v2," which evaluates bias in how behaviors are attributed to in-groups and out-groups during reasoning while the conclusion is held fixed. The dataset consists of 216 examples reflecting cultural biases specific to Japan. Experimental results verified that it detects performance differences across models more sensitively than existing benchmarks.
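To make the contrastive design concrete, the sketch below shows one way such an evaluation could be scored in Python. Everything here is an illustrative assumption rather than the paper's implementation: the item schema, the `model.generate` interface, and the keyword heuristic standing in for a real attribution classifier are all hypothetical, since the page does not describe the actual scoring protocol.

```python
# Hypothetical sketch of a JUBAKU-v2-style contrastive evaluation.
# Field names, the model interface, and the classifier are assumptions
# for illustration; the paper's actual schema and scoring may differ.
from dataclasses import dataclass

@dataclass
class ContrastiveItem:
    context_ingroup: str   # behavior described for an in-group member
    context_outgroup: str  # same behavior, attributed to an out-group member
    conclusion: str        # fixed conclusion shared by both variants

def classify_attribution(reasoning: str) -> str:
    """Toy heuristic: label a reasoning trace as 'internal' (disposition)
    or 'external' (circumstance). A real evaluator would likely use an
    LLM judge or a trained classifier instead of keyword matching."""
    internal_cues = ["性格", "能力", "人柄"]  # disposition cues: personality, ability, character
    return "internal" if any(cue in reasoning for cue in internal_cues) else "external"

def attribution_gap(model, items: list[ContrastiveItem]) -> float:
    """Fraction of items where the model attributes the same behavior
    differently depending only on group membership."""
    mismatches = 0
    for item in items:
        # "Explain the reason for the following behavior and state a conclusion."
        prompt = "次の行動の理由を説明し、結論を述べてください。\n"
        reasoning_in = model.generate(prompt + item.context_ingroup)
        reasoning_out = model.generate(prompt + item.context_outgroup)
        if classify_attribution(reasoning_in) != classify_attribution(reasoning_out):
            mismatches += 1  # reasoning diverged even though the conclusion is fixed
    return mismatches / len(items)
```

The design point mirrored here is that the two prompts differ only in group membership, so any divergence in the attribution style of the reasoning, with the conclusion held fixed, points to attributional bias rather than to a difference in the scenario itself.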
Problem

Research questions and friction points this paper is trying to address.

social bias · reasoning · attribution theory · Japanese benchmark · Large Language Models
Innovation

Methods, ideas, or system contributions that make the work stand out.

social bias · attribution theory · reasoning evaluation · Japanese benchmark · LLM fairness