Afri-MCQA: Multimodal Cultural Question Answering for African Languages

📅 2026-01-09

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the underrepresentation of African languages in AI research, particularly in multimodal cultural question answering. The authors introduce Afri-MCQA, which they describe as the first multilingual cultural QA benchmark of its kind: 7.5k Q&A pairs across 15 African languages from 12 countries, with parallel English-African language pairs in text and speech, all created by native speakers. Open-weight LLMs perform poorly across the evaluated cultures, with near-zero accuracy on open-ended VQA when queried in a native language or in speech. Control experiments that separate linguistic competence from cultural knowledge show significant performance gaps between native languages and English for both text and speech. The authors conclude that these findings call for speech-first approaches, culturally grounded pretraining, and cross-lingual cultural transfer.

📝 Abstract

Africa is home to over one-third of the world's languages, yet remains underrepresented in AI research. We introduce Afri-MCQA, the first Multilingual Cultural Question-Answering benchmark covering 7.5k Q&A pairs across 15 African languages from 12 countries. The benchmark offers parallel English-African language Q&A pairs across text and speech modalities and was entirely created by native speakers. Benchmarking large language models (LLMs) on Afri-MCQA shows that open-weight models perform poorly across evaluated cultures, with near-zero accuracy on open-ended VQA when queried in native language or speech. To evaluate linguistic competence, we include control experiments meant to assess this specific aspect separate from cultural knowledge, and we observe significant performance gaps between native languages and English for both text and speech. These findings underscore the need for speech-first approaches, culturally grounded pretraining, and cross-lingual cultural transfer. To support more inclusive multimodal AI development in African languages, we release our Afri-MCQA under academic license or CC BY-NC 4.0 on HuggingFace (https://huggingface.co/datasets/Atnafu/Afri-MCQA)
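As a rough illustration of the English-versus-native comparison the abstract describes, the sketch below computes accuracy per query language and the gap between the two. It is not the authors' evaluation code: the record fields (`query_language`, `answer`, `prediction`), the helper `accuracy_by_group`, and the toy values are all hypothetical, and exact-match scoring is an assumption that suits multiple-choice items rather than open-ended answers.

```python
from collections import defaultdict


def accuracy_by_group(records, key):
    """Return {group: fraction of records whose prediction equals the gold answer}."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for record in records:
        group = record[key]
        total[group] += 1
        # Exact match is an assumption; open-ended answers would need a softer metric.
        correct[group] += int(record["prediction"] == record["answer"])
    return {group: correct[group] / total[group] for group in total}


# Hypothetical toy records; field names and values are illustrative only
# and are not taken from the Afri-MCQA dataset.
records = [
    {"query_language": "english", "answer": "B", "prediction": "B"},
    {"query_language": "english", "answer": "A", "prediction": "A"},
    {"query_language": "native", "answer": "B", "prediction": "C"},
    {"query_language": "native", "answer": "A", "prediction": "A"},
]

acc = accuracy_by_group(records, "query_language")
# Positive gap means the model did better when queried in English.
gap = acc["english"] - acc["native"]
```

Grouping by a different key, such as a per-language or per-modality field, would give the corresponding breakdown with the same helper.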

Problem

Research questions and friction points this paper is trying to address.

African languages

multimodal question answering

cultural representation

speech modality

linguistic competence

Innovation

Methods, ideas, or system contributions that make the work stand out.

multimodal

African languages

cultural question answering

speech-first

cross-lingual transfer

🔎 Similar Papers

CaLMQA: Exploring culturally specific long-form question answering across 23 languages

2024-06-25 | arXiv.org | Citations: 13