What Do Indonesians Really Need from Language Technology? A Nationwide Survey

📅 2025-06-09

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Indonesia’s 700+ indigenous languages remain severely underrepresented in NLP, yet the actual technological needs of their speech communities have not been characterized empirically, which contributes to high development costs and poor adaptability. This study presents the first nationwide, multilingual empirical survey across Indonesia’s diverse language communities, integrating structured questionnaires, stratified sampling, in-depth interviews, and quantitative demand-prioritization analysis. Results identify machine translation and information retrieval as the highest-priority applications. Public enthusiasm for AI is strong, but trust remains low, with privacy, algorithmic bias, and data transparency emerging as critical governance concerns. The study proposes a “demand-driven + ethics-first” framework for localized language technology development. It fills a key empirical gap in low-resource language technology needs assessment and offers a reproducible methodological paradigm for multilingual AI governance worldwide.

📝 Abstract

There is an emerging effort to develop NLP for Indonesia's 700+ local languages, but progress remains costly due to the need for direct engagement with native speakers. However, it is unclear what these language communities truly need from language technology. To address this, we conduct a nationwide survey to assess the actual needs of native speakers in Indonesia. Our findings indicate that addressing language barriers, particularly through machine translation and information retrieval, is the most critical priority. Although there is strong enthusiasm for advancements in language technology, concerns around privacy, bias, and the use of public data for AI training highlight the need for greater transparency and clear communication to support broader AI adoption.

Problem

Research questions and friction points this paper is trying to address.

Assessing native speakers' needs for language technology in Indonesia

Identifying critical priorities like machine translation and information retrieval

Addressing privacy and bias concerns in AI training data usage

Innovation

Methods, ideas, or system contributions that make the work stand out.

Nationwide survey to assess language technology needs

Focus on machine translation and information retrieval

Emphasis on transparency in AI data usage

🔎 Similar Papers

Native Design Bias: Studying the Impact of English Nativeness on Language Model Performance