AI Summary
This work addresses the limited performance of large language models on multi-hop reasoning tasks involving culturally rich contexts such as Indian culture, where existing cultural evaluation benchmarks are often manually constructed, single-hop, and difficult to scale. To bridge this gap, the authors propose VIRAASAT, a semi-automated framework for generating multi-hop questions grounded in a comprehensive Indian cultural knowledge graph encompassing over 700 expert-curated entities, 13 cultural attributes, and all 36 administrative regions of India. They further introduce the Symbolic Chain-of-Manipulation (SCoM) framework, which enables models to internalize atomic operations over the knowledge graph and reliably traverse its topological structure. Experiments demonstrate that SCoM improves performance over standard chain-of-thought prompting by up to 20% on VIRAASAT, substantially overcoming bottlenecks in reasoning about low-probability cultural facts. The dataset and methodology are publicly released to advance research in culturally aware reasoning.
Abstract
Large Language Models (LLMs) have made significant progress on reasoning tasks across domains such as mathematics and coding. However, their performance deteriorates on tasks requiring rich socio-cultural knowledge and diverse local contexts, particularly those involving Indian culture. Existing cultural benchmarks are (i) manually crafted, (ii) limited to single-hop questions that test factual recall, and (iii) prohibitively costly to scale, leaving this deficiency largely unmeasured. To address this, we introduce VIRAASAT, a novel, semi-automated approach for generating a culture-specific multi-hop Question-Answering dataset for Indian culture. VIRAASAT leverages a Knowledge Graph comprising more than 700 expert-curated cultural artifacts covering 13 key attributes of Indian culture (e.g., history, festivals). VIRAASAT spans all 28 states and 8 Union Territories, yielding more than 3,200 multi-hop questions that necessitate chained cultural reasoning. We evaluate current state-of-the-art (SOTA) LLMs on VIRAASAT and identify key limitations in reasoning, wherein fine-tuning on Chain-of-Thought (CoT) traces fails to ground and synthesize low-probability facts. To bridge this gap, we propose a novel framework named Symbolic Chain-of-Manipulation (SCoM). Adapting the Chain-of-Manipulation paradigm, we train the model to simulate atomic Knowledge Graph manipulations internally; SCoM thereby teaches the model to reliably traverse the topological structure of the graph. Supervised Fine-Tuning (SFT) experiments demonstrate that SCoM outperforms standard CoT baselines by up to 20%. We release the VIRAASAT dataset along with our findings, laying a strong foundation for building culturally aware reasoning models.
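To make the "chained cultural reasoning" idea concrete, the following is a minimal illustrative sketch (not the actual VIRAASAT pipeline) of how a multi-hop question can be composed by chaining attribute edges in a small cultural knowledge graph. The toy entities, attribute names, and the `two_hop_question` helper are all hypothetical and chosen only for demonstration.

```python
# Toy knowledge graph: entity -> {attribute: value}.
# Entities and attributes here are illustrative examples, not VIRAASAT data.
KG = {
    "Bihu": {"type": "festival", "celebrated_in": "Assam"},
    "Assam": {"type": "state", "classical_dance": "Sattriya"},
}

def two_hop_question(entity: str, hop1: str, hop2: str):
    """Chain two lookups: entity --hop1--> bridge entity --hop2--> answer.

    Answering the generated question requires both facts, so a model
    cannot succeed on single-hop recall alone.
    """
    bridge = KG[entity][hop1]          # first hop, e.g. festival -> state
    answer = KG[bridge][hop2]          # second hop, e.g. state -> dance form
    question = (
        f"What is the {hop2.replace('_', ' ')} of the region "
        f"where {entity} is {hop1.replace('_', ' ')}?"
    )
    return question, answer

q, a = two_hop_question("Bihu", "celebrated_in", "classical_dance")
# q asks for the classical dance of the region where Bihu is celebrated;
# a is "Sattriya", reachable only via the bridge entity "Assam".
```

A semi-automated generator in this spirit would enumerate such attribute chains over the full graph and then have experts verify the phrased questions, which is what makes scaling beyond manually crafted single-hop benchmarks feasible.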