🤖 AI Summary
Multilingual question-answering (QA) systems face the dual challenges of objective factual consistency and subjective cultural adaptability—challenges that are particularly critical in sensitive domains such as maternal and child health. To address this, we propose MIND, the first framework to integrate user-in-the-loop verification into multilingual factual consistency detection, explicitly distinguishing factual errors from culturally appropriate variations. Our method combines multilingual NLP techniques, context-aware answer comparison, culture-sensitive annotation, and bilingual human validation. Leveraging this pipeline, we construct the first publicly available bilingual dataset annotated for fact–culture inconsistency. Experiments demonstrate that MIND effectively identifies cross-lingual divergent answers across multiple domains, exhibits strong cross-domain generalization, and provides a scalable technical pathway—along with benchmark resources—for developing culturally aware, factually reliable multilingual QA systems.
📝 Abstract
Multilingual question answering (QA) systems must ensure factual consistency across languages, especially for objective queries such as *What is jaundice?*, while also accounting for cultural variation in subjective responses. We propose MIND, a user-in-the-loop fact-checking pipeline to detect factual and cultural discrepancies in multilingual QA knowledge bases. MIND highlights divergent answers to culturally sensitive questions (e.g., *Who assists in childbirth?*) that vary by region and context. We evaluate MIND on a bilingual QA system in the maternal and infant health domain and release a dataset of bilingual questions annotated for factual and cultural inconsistencies. We further test MIND on datasets from other domains to assess generalization. In all cases, MIND reliably identifies inconsistencies, supporting the development of more culturally aware and factually consistent QA systems.
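The abstract does not specify how MIND compares answers across languages, but the core step of context-aware answer comparison can be illustrated with a minimal, hypothetical sketch: encode each language's answer into a shared vector space (in practice, with a multilingual sentence encoder) and flag question pairs whose answers fall below a similarity threshold as candidates for human review. The toy vectors, the `flag_divergent` helper, and the 0.8 threshold below are all illustrative assumptions, not the paper's actual method.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def flag_divergent(answer_pairs, threshold=0.8):
    """Return indices of answer pairs whose cross-lingual
    embeddings diverge (similarity below the threshold).
    These are candidates for bilingual human validation."""
    return [i for i, (vec_a, vec_b) in enumerate(answer_pairs)
            if cosine(vec_a, vec_b) < threshold]

# Toy embeddings standing in for real multilingual sentence
# encodings of the same question's answers in two languages.
pairs = [
    ([1.0, 0.0, 0.1], [0.9, 0.1, 0.1]),  # consistent answers
    ([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]),  # divergent answers
]
print(flag_divergent(pairs))  # -> [1]
```

Flagged pairs would then go to culture-sensitive annotation to decide whether a divergence is a factual error or a culturally appropriate variation.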