FiMI: A Domain-Specific Language Model for Indian Finance Ecosystem

📅 2026-02-05
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the limitations of existing large language models in the Indian financial ecosystem, particularly in digital payments, transaction disputes, and mandate (authorization) management, and the lack of models that handle English, Hindi, and Hinglish together. To bridge this gap, the authors present FiMI, a multilingual model tailored for Indian financial scenarios and built on the Mistral Small 24B architecture. FiMI is developed through continued pre-training on 68 billion tokens of financial, multilingual, and synthetic data, followed by instruction fine-tuning and domain-specific supervised fine-tuning on multi-turn, tool-driven dialogues. According to the reported evaluations, FiMI Base improves on the Mistral Small 24B Base model by 20% on a finance reasoning benchmark, and FiMI Instruct outperforms the Mistral Small 24B Instruct model by 87% on domain-specific tool calling, while both remain comparable to similarly sized models on general benchmarks.

📝 Abstract
We present FiMI (Finance Model for India), a domain-specialized financial language model developed for Indian digital payment systems. We develop two model variants: FiMI Base and FiMI Instruct. FiMI adapts the Mistral Small 24B architecture through a multi-stage training pipeline, beginning with continued pre-training on 68 billion tokens of curated financial, multilingual (English, Hindi, Hinglish), and synthetic data. This is followed by instruction fine-tuning and domain-specific supervised fine-tuning focused on multi-turn, tool-driven conversations that model real-world workflows, such as transaction disputes and mandate lifecycle management. Evaluations show that FiMI Base achieves a 20% improvement over the Mistral Small 24B Base model on a finance reasoning benchmark, while FiMI Instruct outperforms the Mistral Small 24B Instruct model by 87% on domain-specific tool calling. Moreover, FiMI achieves these domain gains while maintaining performance comparable to models of similar size on general benchmarks.
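To make "multi-turn, tool-driven conversations" concrete, the sketch below shows what dispatching model-emitted tool calls for a transaction dispute might look like. It is an illustration only and not the FiMI implementation: the tool names (`get_transaction_status`, `raise_dispute`), their arguments, the JSON call format, and the toy ledger are all assumptions introduced here, and the model output is replaced by a scripted list of calls.

```python
import json
from typing import Callable, Dict, List

# Hypothetical tools: names, arguments, and return values are illustrative
# assumptions, not APIs described in the paper.
def get_transaction_status(txn_id: str) -> dict:
    """Look up a transaction in a toy in-memory ledger."""
    ledger = {"TXN123": {"status": "FAILED", "amount_debited": True}}
    return ledger.get(txn_id, {"status": "NOT_FOUND"})

def raise_dispute(txn_id: str, reason: str) -> dict:
    """Pretend to open a dispute ticket for a transaction."""
    return {"txn_id": txn_id, "reason": reason, "dispute_state": "OPEN"}

# Registry mapping tool names to the callables that implement them.
TOOLS: Dict[str, Callable[..., dict]] = {
    "get_transaction_status": get_transaction_status,
    "raise_dispute": raise_dispute,
}

def execute_tool_call(raw_call: str) -> dict:
    """Parse one JSON tool call and run the matching tool."""
    call = json.loads(raw_call)
    name = call["name"]
    if name not in TOOLS:
        return {"error": f"unknown tool: {name}"}
    return TOOLS[name](**call["arguments"])

def run_dialogue(model_turns: List[str]) -> List[dict]:
    """Execute scripted tool calls in order and collect the tool results."""
    transcript = []
    for raw_call in model_turns:
        transcript.append({"role": "tool", "content": execute_tool_call(raw_call)})
    return transcript

# Scripted stand-in for what an instruction-tuned model might emit
# across two turns of a dispute workflow (assumed format).
SCRIPTED_TURNS = [
    json.dumps({"name": "get_transaction_status",
                "arguments": {"txn_id": "TXN123"}}),
    json.dumps({"name": "raise_dispute",
                "arguments": {"txn_id": "TXN123",
                              "reason": "amount debited but payment failed"}}),
]

if __name__ == "__main__":
    for message in run_dialogue(SCRIPTED_TURNS):
        print(message)
```

In a real deployment the scripted list would be replaced by calls to the model, with each tool result appended to the conversation before the next turn is generated; a tool-calling benchmark would then check whether the emitted tool name and arguments match the expected ones.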
Problem

Research questions and friction points this paper is trying to address.

domain-specific language model
Indian finance ecosystem
digital payment systems
tool-calling
financial reasoning
Innovation

Methods, ideas, or system contributions that make the work stand out.

domain-specific language model
financial NLP
instruction fine-tuning
multilingual finance data
tool-augmented reasoning
Authors

Aboli Kathar
Aman Kumar
Anusha Kamath
A. Srujan
Ashish Sharma
Chandra Bhushan
Dilip Asbe
Divya Sorate
Duddu Prasanth Kumar
Evan Acharya
Harsh Sharma
Hrithik Kadam
Kanishk Singla
Keyur Doshi
Kiran Praveen (Chief Engineer, Samsung R&D Institute Bangalore; Machine Learning, Deep Learning, Speech Recognition)
SK KolisettyKrishna
Krishanu Adhikary
Lokesh Mpt
Mayurdeep Sonowal
Nadeem Shaikh
Navya Prakash
Nimit Kothari
Nitin Kukreja
Prashant Devadiga
Rakesh Paul (Senior Deep Learning Scientist, NVIDIA; Multilingual NLP, LLM, Model Optimisation, LLM Safety)
Ratanjeet Pratap Chauhan
Raunak Kalani
Raviraj Joshi (Indian Institute of Technology Madras; computer science, machine learning, natural language processing)
M. Shamanth
Shantanu Pandey
Shubham Soni
Siddharth Dixit
Smriti Jopat
Sunil Patel
Suraj Singh
Suvradip Paul
Tulasi Pilla
Utkarsh Vaidya
Vineeth Nambiar
Vishal Kanvaty
Yatharth Dedhia