🤖 AI Summary
Large language models (LLMs) exhibit pervasive cultural biases due to training data skewed toward high-resource languages, limiting their capacity to accurately represent multicultural contexts in low-resource language settings.
Method: We introduce MyCulture, the first Malay-language benchmark for evaluating LLMs on Malaysian multiculturalism, covering six domains: arts, attire, customs, entertainment, food, and religion. It employs open-ended multiple-choice questions (without predefined options), contrasts structured output with free-form generation to expose structural bias, and incorporates multilingual prompt variants to quantify language bias and cross-lingual consistency. A theoretical analysis justifies the effectiveness of the open-ended format.
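A minimal sketch of what scoring such option-free questions could look like; the function and field names below are illustrative assumptions, not the benchmark's actual harness:

```python
# Hypothetical sketch of an open-ended MCQ evaluation loop: the model sees
# the question WITHOUT answer options, and its free-form response is checked
# against the gold answer. Names here are illustrative, not MyCulture's API.
from dataclasses import dataclass

@dataclass
class Item:
    question: str   # e.g. a Bahasa Melayu question about Malaysian customs
    answer: str     # gold answer string

def normalize(text: str) -> str:
    """Lowercase and strip punctuation for lenient answer matching."""
    return "".join(c for c in text.lower() if c.isalnum() or c.isspace()).strip()

def score_open_ended(items: list[Item], generate) -> float:
    """Accuracy when no options are shown: random guessing yields ~0,
    unlike the 1/k floor of a k-option multiple-choice format."""
    correct = 0
    for item in items:
        # No options in the prompt -- the model must produce the answer itself.
        prediction = generate(item.question)
        if normalize(item.answer) in normalize(prediction):
            correct += 1
    return correct / len(items)

# Usage with any text-generation callable, e.g. an API wrapper:
# acc = score_open_ended(dataset, lambda q: client.complete(q))
```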
Results: Experiments reveal significant performance disparities among leading regional and global LLMs on MyCulture, exposing systematic deficits in their understanding of low-resource cultural contexts. These findings underscore the urgent need for culturally embedded, linguistically inclusive evaluation frameworks.
📝 Abstract
Large Language Models (LLMs) often exhibit cultural biases due to training data dominated by high-resource languages like English and Chinese. This poses challenges for accurately representing and evaluating diverse cultural contexts, particularly in low-resource language settings. To address this, we introduce MyCulture, a benchmark designed to comprehensively evaluate LLMs on Malaysian culture across six pillars: arts, attire, customs, entertainment, food, and religion, presented in Bahasa Melayu. Unlike conventional benchmarks, MyCulture employs a novel open-ended multiple-choice question format without predefined options, thereby reducing guessing and mitigating format bias. We provide a theoretical justification for the effectiveness of this open-ended structure in improving both fairness and discriminative power. Furthermore, we analyze structural bias by comparing model performance on structured versus free-form outputs, and assess language bias through multilingual prompt variations. Our evaluation across a range of regional and international LLMs reveals significant disparities in cultural comprehension, highlighting the urgent need for culturally grounded and linguistically inclusive benchmarks in the development and assessment of LLMs.
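To see why removing predefined options can improve both fairness and discriminative power, consider a back-of-the-envelope guessing model (an illustrative assumption, not the paper's actual theoretical justification):

```latex
% Let p be the fraction of items a model truly knows, and assume it guesses
% uniformly over k predefined options on the rest.
\[
  \mathbb{E}[\mathrm{acc}_{\mathrm{MCQ}}] = p + \frac{1-p}{k},
  \qquad
  \mathbb{E}[\mathrm{acc}_{\mathrm{open}}] = p .
\]
% The MCQ score has a guessing floor of 1/k, and the gap between two models
% with knowledge rates p_1 > p_2 is compressed by a factor (1 - 1/k):
\[
  \mathbb{E}[\mathrm{acc}_{\mathrm{MCQ}}^{(1)}] - \mathbb{E}[\mathrm{acc}_{\mathrm{MCQ}}^{(2)}]
  = (p_1 - p_2)\left(1 - \tfrac{1}{k}\right),
\]
% whereas the open-ended gap is the full p_1 - p_2, i.e. the format
% separates models by their actual knowledge with no guessing inflation.
```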