Analyzing Social Biases in Japanese Large Language Models

📅 2024-06-04
🏛️ arXiv.org
📈 Citations: 4
Influential: 1
📄 PDF
🤖 AI Summary
A systematic evaluation of social bias in Japanese large language models (LLMs) has been lacking. Method: We introduce JBBQ (Japanese Bias Benchmark for Question Answering), the first benchmark for assessing social bias in Japanese LLMs, adapted cross-lingually from the English BBQ dataset and covering multiple social dimensions, including gender. We compare multiple models and quantify bias with a standardized scoring metric. Contribution/Results: Our analysis reveals, for the first time, that as parameter counts grow, open Japanese LLMs gain accuracy on JBBQ but also show higher bias scores. We further show that mitigation strategies, specifically warning prompts and chain-of-thought (CoT) prompting, reduce bias in model outputs (average reduction of 12.3%), though they do not eliminate it and leave room for improvement in reasoning consistency. JBBQ provides a standardized, publicly available tool for fairness research on Japanese LLMs, addressing a critical gap in social bias evaluation for non-English foundation models.

📝 Abstract
With the development of Large Language Models (LLMs), social biases in LLMs have become a crucial issue. While various benchmarks for social biases have been provided across languages, the extent to which Japanese LLMs exhibit social biases has not been fully investigated. In this study, we construct the Japanese Bias Benchmark dataset for Question Answering (JBBQ) based on the English bias benchmark BBQ, and analyze social biases in Japanese LLMs. The results show that while current open Japanese LLMs improve their accuracy on JBBQ as parameter counts grow, their bias scores also become larger. In addition, prompts with warnings about social biases and Chain-of-Thought prompting reduce the effect of biases in model outputs, but there is room for improvement in the consistency of reasoning.
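For concreteness, BBQ-style benchmarks typically report separate bias scores for disambiguated and ambiguous contexts. The sketch below assumes the common BBQ formulation (bias in disambiguated contexts, scaled by inaccuracy in ambiguous ones); JBBQ's exact metric may differ in detail.

```python
# Minimal sketch of a BBQ-style bias score. This assumes the standard
# BBQ formulation; it is an illustration, not JBBQ's exact metric.

def disambig_bias_score(n_biased: int, n_non_unknown: int) -> float:
    """Bias score in disambiguated contexts:
    2 * (stereotype-consistent answers / non-UNKNOWN answers) - 1,
    ranging from -1 (counter-stereotypical) to 1 (stereotypical)."""
    return 2 * (n_biased / n_non_unknown) - 1

def ambig_bias_score(accuracy: float, s_dis: float) -> float:
    """Bias score in ambiguous contexts, scaled by (1 - accuracy),
    so a model that correctly answers UNKNOWN scores near zero."""
    return (1 - accuracy) * s_dis

# Example: 60 of 80 non-UNKNOWN answers follow the stereotype, and the
# model answers 70% of ambiguous questions correctly.
s_dis = disambig_bias_score(60, 80)   # 0.5
s_amb = ambig_bias_score(0.7, s_dis)
print(s_dis, s_amb)
```

Under this formulation, a larger score indicates more stereotype-consistent behavior, which is how "bias scores become larger" in the abstract can be read.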
Problem

Research questions and friction points this paper is trying to address.

Assessing social biases in Japanese large language models
Evaluating bias mitigation effectiveness in Japanese LLM prompts
Developing a Japanese benchmark for bias analysis in LLMs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Constructed Japanese Bias Benchmark dataset JBBQ
Analyzed social biases in Japanese LLMs
Used warning prompts and chain-of-thought techniques
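The two mitigation techniques in the list above can be sketched as simple prompt transformations. The Japanese wording below is an illustrative assumption, not the exact prompts used in the paper.

```python
# Illustrative sketch of bias-warning and chain-of-thought prompting.
# The specific instruction strings are assumptions for illustration.

def build_prompt(question: str, warn: bool = False, cot: bool = False) -> str:
    parts = []
    if warn:
        # Warning prompt: instruct the model not to rely on social biases.
        parts.append("注意: 社会的バイアスに基づいて回答しないでください。")
    parts.append(question)
    if cot:
        # Chain-of-thought prompt: ask for step-by-step reasoning.
        parts.append("ステップバイステップで考えてください。")
    return "\n".join(parts)

print(build_prompt("誰が会議に遅れましたか?", warn=True, cot=True))
```

Each flag can be toggled independently, which matches how such mitigation conditions are usually compared against a plain-question baseline.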
Hitomi Yanaka
The University of Tokyo, RIKEN
Natural Language Processing, Semantics
Namgi Han
The University of Tokyo
Ryoma Kumon
The University of Tokyo
Jie Lu
The University of Tokyo
Masashi Takeshita
Hokkaido University
Ryo Sekizawa
The University of Tokyo
Taisei Kato
Hiromi Arai
RIKEN