Afri-MCQA: Multimodal Cultural Question Answering for African Languages

📅 2026-01-09
🏛️ arXiv.org
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work addresses the underrepresentation of African languages in AI research, particularly in multimodal cultural question answering. The authors introduce Afri-MCQA, a benchmark of 7.5k Q&A pairs across 15 African languages from 12 countries, with parallel English-African language pairs in text and speech, all created by native speakers. Open-weight large language models perform poorly across the evaluated cultures, with near-zero accuracy on open-ended VQA when queried in the native language or in speech. Control experiments that separate linguistic competence from cultural knowledge show significant performance gaps between native languages and English for both text and speech. The findings point to the need for speech-first approaches, culturally grounded pretraining, and cross-lingual cultural transfer.

📝 Abstract
Africa is home to over one-third of the world's languages, yet remains underrepresented in AI research. We introduce Afri-MCQA, the first Multilingual Cultural Question-Answering benchmark covering 7.5k Q&A pairs across 15 African languages from 12 countries. The benchmark offers parallel English-African language Q&A pairs across text and speech modalities and was entirely created by native speakers. Benchmarking large language models (LLMs) on Afri-MCQA shows that open-weight models perform poorly across evaluated cultures, with near-zero accuracy on open-ended VQA when queried in native language or speech. To evaluate linguistic competence, we include control experiments that assess this aspect separately from cultural knowledge, and we observe significant performance gaps between native languages and English for both text and speech. These findings underscore the need for speech-first approaches, culturally grounded pretraining, and cross-lingual cultural transfer. To support more inclusive multimodal AI development in African languages, we release Afri-MCQA under academic license or CC BY-NC 4.0 on HuggingFace (https://huggingface.co/datasets/Atnafu/Afri-MCQA).
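The native-versus-English comparison described in the abstract can be illustrated with a small sketch. This is not the authors' evaluation code: the record fields (`language`, `query_language`, `prediction`, `answer`), the placeholder language names, and the toy data are all assumptions made for illustration, and the sketch only covers exact-match scoring of multiple-choice answers.

```python
from collections import defaultdict

def accuracy_by_condition(records):
    """Return {(language, query_language): accuracy} using exact-match scoring."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for r in records:
        key = (r["language"], r["query_language"])
        total[key] += 1
        correct[key] += int(r["prediction"] == r["answer"])
    return {key: correct[key] / total[key] for key in total}

def native_english_gap(records):
    """Return {language: accuracy(English query) - accuracy(native query)}."""
    acc = accuracy_by_condition(records)
    gaps = {}
    for (lang, query_language) in acc:
        # Only report a gap when both conditions were evaluated for this language.
        if query_language == "native" and (lang, "english") in acc:
            gaps[lang] = acc[(lang, "english")] - acc[(lang, "native")]
    return gaps

# Hypothetical toy records; "lang_a" and "lang_b" are placeholders, not benchmark languages.
records = [
    {"language": "lang_a", "query_language": "english", "prediction": "B", "answer": "B"},
    {"language": "lang_a", "query_language": "english", "prediction": "C", "answer": "C"},
    {"language": "lang_a", "query_language": "native", "prediction": "B", "answer": "B"},
    {"language": "lang_a", "query_language": "native", "prediction": "A", "answer": "C"},
    {"language": "lang_b", "query_language": "english", "prediction": "D", "answer": "D"},
    {"language": "lang_b", "query_language": "english", "prediction": "A", "answer": "B"},
    {"language": "lang_b", "query_language": "native", "prediction": "A", "answer": "D"},
    {"language": "lang_b", "query_language": "native", "prediction": "C", "answer": "B"},
]

gaps = native_english_gap(records)
```

A positive gap means the same questions were answered more accurately when asked in English than in the native language, which is the direction of the gap the abstract reports.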
Problem

Research questions and friction points this paper is trying to address.

African languages
multimodal question answering
cultural representation
speech modality
linguistic competence
Innovation

Methods, ideas, or system contributions that make the work stand out.

multimodal
African languages
cultural question answering
speech-first
cross-lingual transfer
Atnafu Lambebo Tonja
Postdoc at MBZUAI
NLP for low-resource languages, Multilingual language models, Speech Technology
Srija Anand
MS by Research, AI4Bharat, IIT Madras
Speech Synthesis, Natural Language Processing, LLM Evaluation
Emilio Villa-Cueva
MBZUAI
Israel Abebe Azime
Saarland University
NLP | Multimodal learning | Deep Learning Applications
Jesujoba O. Alabi
Saarland University
Muhidin A. Mohamed
Aston University
Natural Language Processing, Data Science, Information Retrieval, Artificial Intelligence, Communication Networks
Debela Desalegn Yadeta
Addis Ababa University
Negasi Haile Abadi
Lesan AI
Abigail Oppong
Independent
Nnaemeka Casmir Obiefuna
Friedrich-Alexander University
Idris Abdulmumin
Postdoctoral Fellow, DSFSI, University of Pretoria
Machine Translation, Neural Machine Translation, Natural Language Processing, Internet Technology
Naome A. Etori
Department of Computer Science and Engineering, University of Minnesota-Twin Cities
AI, NLP, Healthcare, HCI, Computational Social Science
Eric Peter Wairagala
Lelapa AI
Kanda Patrick Tshinu
Tshwane University of Technology
Imanigirimbabazi Emmanuel
Kabale University
Gabofetswe Malema
University of Botswana
Alham Fikri Aji
MBZUAI, Monash Indonesia
Multilinguality, Low-resource NLP, Language Modeling, Machine Translation
David Ifeoluwa Adelani
McGill University and Mila - Quebec AI Institute and Canada CIFAR AI Chair
Natural language processing, Multilinguality, Multilingual NLP, AfricaNLP, Low-resource NLP
Thamar Solorio
MBZUAI & University of Houston
Natural Language Processing