BanStereoSet: A Dataset to Measure Stereotypical Social Biases in LLMs for Bangla

📅 2024-09-18
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
📄 PDF
🤖 AI Summary
This study presents the first systematic evaluation of social bias in multilingual large language models (LLMs) for Bengali, covering nine dimensions: race, gender, occupation, age, physical attractiveness, geography, caste, religion, and “attractiveness-in-occupation.” To address the scarcity of non-English bias evaluation resources, we introduce BanglaStereo—the first stereoscopic social bias benchmark for Bengali—comprising 1,194 expert-validated, culturally adapted, and manually translated sentences. We innovatively localize mainstream bias evaluation frameworks (e.g., StereoSet) to the South Asian context by incorporating region-specific bias categories, including caste, regional identity, and religion. Empirical evaluations across multiple multilingual LLMs reveal statistically significant biases across all dimensions. Our work establishes a reproducible, quantitative benchmark for bias measurement in Bengali NLP, enabling rigorous fairness assessment and facilitating equitable AI deployment in Bengali-speaking communities.

Technology Category

Application Category

📝 Abstract
This study presents BanStereoSet, a dataset designed to evaluate stereotypical social biases in multilingual LLMs for the Bangla language. In an effort to extend the focus of bias research beyond English-centric datasets, we have localized the content from the StereoSet, IndiBias, and Kamruzzaman et. al.'s datasets, producing a resource tailored to capture biases prevalent within the Bangla-speaking community. Our BanStereoSet dataset consists of 1,194 sentences spanning 9 categories of bias: race, profession, gender, ageism, beauty, beauty in profession, region, caste, and religion. This dataset not only serves as a crucial tool for measuring bias in multilingual LLMs but also facilitates the exploration of stereotypical bias across different social categories, potentially guiding the development of more equitable language technologies in Bangladeshi contexts. Our analysis of several language models using this dataset indicates significant biases, reinforcing the necessity for culturally and linguistically adapted datasets to develop more equitable language technologies.
Problem

Research questions and friction points this paper is trying to address.

Measure social biases in Bangla multilingual LLMs
Extend bias research beyond English-centric datasets
Evaluate biases across 9 categories in Bangla
Innovation

Methods, ideas, or system contributions that make the work stand out.

Localized datasets for Bangla bias evaluation
Covers 9 social bias categories comprehensively
Analyzes multilingual LLMs for cultural biases
🔎 Similar Papers
No similar papers found.
M
M. Kamruzzaman
University of South Florida
A
Abdullah Al Monsur
University of South Florida
S
Shrabon Das
University of South Florida
Enamul Hassan
Enamul Hassan
PhD Student at CS, SBU | Assistant Professor at CSE, SUST
Large Language ModelNatural Language ProcessingMachine Learning
Gene Louis Kim
Gene Louis Kim
University of South Florida