🤖 AI Summary
This study addresses the poor performance of existing medical large language models in low-resource languages such as Hindi, particularly their limited capacity for reasoning about Indian traditional medicine. To bridge this gap, the authors introduce HiMed, the first Hindi medical reasoning corpus and evaluation benchmark encompassing both Western and Indian medical knowledge. They propose HiMed-8B, a model trained with a decaying scaffolding reward to enhance cross-lingual transfer of medical knowledge. Experimental results show that this approach improves Hindi medical reasoning performance and narrows the accuracy gap between English and Hindi. Ablation studies further confirm the contribution of each training stage and reward component.
📝 Abstract
Medical large language models hold promise for reducing healthcare disparities, yet Hindi remains severely underrepresented. While medical LLMs excel in high-resource languages, their performance degrades sharply in Hindi, particularly on Indian systems of medicine. We argue that robust cross-lingual medical transfer requires reasoning in Hindi. To this end, we introduce HiMed, a Hindi medical reasoning corpus and benchmark suite covering both Western and Indian medicine. We further propose HiMed-8B, a medical LLM that reasons in Hindi, trained with a decaying scaffolding reward. Extensive experiments demonstrate improved Hindi medical reasoning performance and a reduced English-Hindi accuracy gap. Ablation studies validate the contribution of each training stage and reward component. All data and code are available on GitHub: https://github.com/FreedomIntelligence/HiMed.