Indic-TunedLens: Interpreting Multilingual Models in Indian Languages

📅 2026-01-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the limited effectiveness of existing interpretability tools for large language models in multilingual settings, particularly for morphologically complex and low-resource Indian languages. To bridge this gap, the authors propose the first interpretability framework tailored specifically for Indian languages. The approach introduces a language-adaptive shared affine transformation mechanism that performs language-specific calibration of intermediate hidden states in multilingual Transformers, enabling more accurate alignment with the output distribution of the target language. Built upon an enhanced Logit Lens methodology, the framework substantially outperforms current techniques across ten Indian languages, with notable gains in decoding faithfulness for low-resource cases. Furthermore, it provides novel insights into the hierarchical semantic encoding properties of multilingual models.

📝 Abstract
Multilingual large language models (LLMs) are increasingly deployed in linguistically diverse regions like India, yet most interpretability tools remain tailored to English. Prior work reveals that LLMs often operate in English-centric representation spaces, making cross-lingual interpretability a pressing concern. We introduce Indic-TunedLens, a novel interpretability framework specifically for Indian languages that learns shared affine transformations. Unlike the standard Logit Lens, which directly decodes intermediate activations, Indic-TunedLens adjusts hidden states for each target language, aligning them with the target output distributions to enable more faithful decoding of model representations. We evaluate our framework on 10 Indian languages using the MMLU benchmark and find that it significantly improves over SOTA interpretability methods, especially for morphologically rich, low-resource languages. Our results provide crucial insights into the layer-wise semantic encoding of multilingual transformers. Our model is available at https://huggingface.co/spaces/AnonymousAccountACL/IndicTunedLens. Our code is available at https://github.com/AnonymousAccountACL/IndicTunedLens.
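The abstract describes a tuned-lens-style mechanism: a learned affine transformation calibrates an intermediate hidden state for a target language before it is projected through the model's unembedding matrix, rather than decoding the raw activation as the Logit Lens does. A minimal sketch of that idea follows; the class name, shapes, and the choice of one lens object per language are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over the last axis."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

class AffineLens:
    """Tuned-lens-style probe: apply a learned affine map to a hidden
    state, then read out token probabilities through the frozen
    unembedding matrix W_U."""

    def __init__(self, d_model, W_U, rng=None):
        rng = rng or np.random.default_rng(0)
        # Initialize near the identity, so an untrained lens behaves
        # like a plain Logit Lens (direct decoding of the activation).
        self.A = np.eye(d_model) + 0.01 * rng.standard_normal((d_model, d_model))
        self.b = np.zeros(d_model)
        self.W_U = W_U  # unembedding matrix, shape (d_model, vocab)

    def decode(self, h):
        # Calibrate the hidden state, then project to vocabulary logits.
        return softmax((h @ self.A + self.b) @ self.W_U)

# Toy setup: one lens per target language, all sharing the same
# frozen unembedding (language codes here are just examples).
d_model, vocab = 8, 16
rng = np.random.default_rng(42)
W_U = rng.standard_normal((d_model, vocab))
lenses = {lang: AffineLens(d_model, W_U) for lang in ["hi", "bn", "ta"]}

h = rng.standard_normal(d_model)       # an intermediate-layer hidden state
probs = lenses["hi"].decode(h)         # per-language calibrated readout
```

In training, `A` and `b` would be fit so the calibrated readout matches the model's final-layer distribution on target-language text; the sketch above only shows the forward pass.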
Problem

Research questions and friction points this paper is trying to address.

multilingual LLMs
interpretability
Indian languages
cross-lingual
low-resource languages
Innovation

Methods, ideas, or system contributions that make the work stand out.

Indic-TunedLens
multilingual interpretability
affine transformation
low-resource languages
logit lens
Mihir Panchal
Dwarkadas Jivanlal Sanghvi College of Engineering, Mumbai, India
Deeksha Varshney
Indian Institute of Technology Jodhpur, Jodhpur, India
Mamta
Jiwaji University, Environmental Science
Asif Ekbal
Department of Computer Science and Engineering, IIT Patna
Artificial Intelligence · Natural Language Processing · Machine Learning Application