🤖 AI Summary
Indic languages—characterized by rich morphology, complex syntax, and strong contextual dependency—pose significant challenges for low-resource question answering (QA). To address this, we introduce state space models (SSMs) to Indic-language QA for the first time, proposing an SSM-based architecture specifically enhanced for low-resource multilingual settings. Our approach integrates cross-lingual transfer learning, Indic-specific preprocessing, and a context alignment mechanism to improve semantic understanding and grounding. Evaluated across multiple Indic-language QA benchmarks, our method achieves substantial gains in question understanding, context matching, and answer generation, outperforming strong baselines by an average of 8.2%. This work not only demonstrates the efficacy of SSMs in modeling morphologically complex languages but also establishes the first dedicated SSM benchmark for Indic QA. By providing both empirical validation and reusable technical components, our study advances low-resource multilingual NLP with a novel architectural paradigm and practical implementation pathway.
📝 Abstract
The diversity and complexity of Indic languages present unique challenges for natural language processing (NLP) tasks, particularly in the domain of question answering (QA).To address these challenges, this paper explores the application of State Space Models (SSMs),to build efficient and contextually aware QA systems tailored for Indic languages. SSMs are particularly suited for this task due to their ability to model long-term and short-term dependencies in sequential data, making them well-equipped to handle the rich morphology, complex syntax, and contextual intricacies characteristic of Indian languages. We evaluated multiple SSM architectures across diverse datasets representing various Indic languages and conducted a comparative analysis of their performance. Our results demonstrate that these models effectively capture linguistic subtleties, leading to significant improvements in question interpretation, context alignment, and answer generation. This work represents the first application of SSMs to question answering tasks in Indic languages, establishing a foundational benchmark for future research in this domain. We propose enhancements to existing SSM frameworks, optimizing their applicability to low-resource settings and multilingual scenarios prevalent in Indic languages.