IndoSafety: Culturally Grounded Safety for LLMs in Indonesian Languages

📅 2025-06-03
📈 Citations: 0
Influential: 0
🤖 AI Summary
The absence of culturally grounded safety evaluation resources for large language models (LLMs) in Indonesia's multilingual, multicultural context hinders responsible deployment. Method: The authors introduce IndoSafety, the first high-quality, human-verified safety evaluation dataset for the Indonesian context, covering formal and colloquial Indonesian as well as Javanese, Sundanese, and Minangkabau, and built on a safety taxonomy extended to capture Indonesia's sociocultural norms. The methodology combines a culture-aware evaluation framework, human-in-the-loop annotation, and supervised fine-tuning (SFT) on the dataset for safety alignment. Contribution/Results: Existing Indonesian-centric LLMs frequently produce unsafe outputs, particularly in colloquial and regional-language settings, while fine-tuning on IndoSafety substantially improves safety without degrading general task performance. The work establishes a culturally grounded, multilingual safety benchmark for Indonesia and a reusable methodology for culturally situated LLM safety assessment and deployment.
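
To make the evaluation setup concrete, the following Python sketch shows one way a per-variety safety evaluation loop could look. It is an illustration only, not the authors' code: the benchmark slice, the generate and judge functions, and the safe-response heuristic are all hypothetical placeholders.

from collections import defaultdict

# Hypothetical benchmark slice: (language_variety, prompt) pairs covering the
# five varieties in the paper. Prompt texts are placeholders, not IndoSafety data.
BENCH = [
    ("formal_id", "<prompt in formal Indonesian>"),
    ("colloquial_id", "<prompt in colloquial Indonesian>"),
    ("javanese", "<prompt in Javanese>"),
    ("sundanese", "<prompt in Sundanese>"),
    ("minangkabau", "<prompt in Minangkabau>"),
]

def judge_safety(response: str) -> int:
    # Placeholder judge: in practice a human annotator or a safety classifier
    # decides whether the response is safe (1) or unsafe (0).
    return 1 if "cannot help" in response.lower() else 0

def evaluate(generate, bench):
    # Query the model on each prompt and report the safe-response rate per variety.
    scores = defaultdict(list)
    for variety, prompt in bench:
        scores[variety].append(judge_safety(generate(prompt)))
    return {variety: sum(s) / len(s) for variety, s in scores.items()}

if __name__ == "__main__":
    # Dummy model that always refuses, just to make the sketch runnable end to end.
    dummy_model = lambda prompt: "Maaf, saya tidak bisa membantu. (Sorry, I cannot help with that.)"
    print(evaluate(dummy_model, BENCH))

A real run would swap the model under test in for dummy_model and the IndoSafety prompts in for BENCH; reporting scores per language variety is what exposes the gap between formal Indonesian and colloquial or regional settings.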

📝 Abstract
Although region-specific large language models (LLMs) are increasingly developed, their safety remains underexplored, particularly in culturally diverse settings like Indonesia, where sensitivity to local norms is essential and highly valued by the community. In this work, we present IndoSafety, the first high-quality, human-verified safety evaluation dataset tailored for the Indonesian context, covering five language varieties: formal and colloquial Indonesian, along with three major local languages: Javanese, Sundanese, and Minangkabau. IndoSafety is constructed by extending prior safety frameworks to develop a taxonomy that captures Indonesia's sociocultural context. We find that existing Indonesian-centric LLMs often generate unsafe outputs, particularly in colloquial and local language settings, while fine-tuning on IndoSafety significantly improves safety while preserving task performance. Our work highlights the critical need for culturally grounded safety evaluation and provides a concrete step toward responsible LLM deployment in multilingual settings. Warning: This paper contains example data that may be offensive, harmful, or biased.
Problem

Research questions and friction points this paper is trying to address.

Ensuring LLM safety in culturally diverse Indonesian languages
Developing a culturally grounded safety evaluation dataset for Indonesia
Improving safety in colloquial and local language LLM outputs
Innovation

Methods, ideas, or system contributions that make the work stand out.

First human-verified LLM safety dataset tailored to the Indonesian context
Extends prior safety taxonomies to capture Indonesia's sociocultural norms and regional languages
Fine-tuning on IndoSafety improves safety while preserving task performance (see the data-formatting sketch after this list)
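
As a rough illustration of that fine-tuning step (not the authors' pipeline), the sketch below converts safety-annotated examples into chat-style SFT records that a standard supervised fine-tuning trainer can consume. Field names, the output filename, and the single example record are hypothetical.

import json

# Hypothetical safety-annotated examples; the real dataset pairs prompts in each
# language variety with human-verified safe responses.
raw_examples = [
    {
        "variety": "colloquial_id",
        "prompt": "<unsafe request written in colloquial Indonesian>",
        "safe_response": "<refusal or safe completion aligned with local norms>",
    },
]

def to_chat_record(example):
    # One SFT training example per (prompt, safe response) pair, tagged with its variety.
    return {
        "variety": example["variety"],
        "messages": [
            {"role": "user", "content": example["prompt"]},
            {"role": "assistant", "content": example["safe_response"]},
        ],
    }

with open("indosafety_sft.jsonl", "w", encoding="utf-8") as f:
    for example in raw_examples:
        f.write(json.dumps(to_chat_record(example), ensure_ascii=False) + "\n")

Mixing such records with general instruction data during SFT is a common way to add safety behaviour without sacrificing task performance, which is the trade-off the paper reports.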