Lost without translation -- Can transformer (language models) understand mood states?

📅 2025-11-28

🤖 AI Summary

This study examines whether language models can semantically represent four mood states (depression, euthymia, euphoric mania, and dysphoric mania) expressed as idioms of distress in 11 Indic languages. The authors compare k-means clustering of direct embeddings of native and Romanised text (from multilingual and Indic-specific models, including IndicBERT and Sarvam-M) against translation-based pipelines: Gemini-translated English, human-translated English, and human-translated English further translated into Chinese and embedded with a Chinese model. Direct embeddings of the Indic-language phrases fail to separate the mood states (composite score 0.002). Gemini-translated English (0.60) and human-translated English (0.61), both embedded with gemini-001, perform far better, and the human-translated English rendered into Chinese scores highest (0.67). The results indicate that current models do not meaningfully represent mood states directly from Indic-language text. High-quality translation bridges the gap for now, although the authors regard reliance on proprietary models or complex translation pipelines as unsustainable.

📝 Abstract

Background: Large Language Models show promise in psychiatry but are English-centric. Their ability to understand mood states in other languages is unclear, as different languages have their own idioms of distress.

Aim: To quantify the ability of language models to faithfully represent phrases (idioms of distress) of four distinct mood states (depression, euthymia, euphoric mania, dysphoric mania) expressed in Indian languages.

Methods: We collected 247 unique phrases for the four mood states across 11 Indic languages. We tested seven experimental conditions, comparing k-means clustering performance on: (a) direct embeddings of native and Romanised scripts (using multilingual and Indic-specific models) and (b) embeddings of phrases translated to English and Chinese. Performance was measured using a composite score based on Adjusted Rand Index, Normalised Mutual Information, Homogeneity and Completeness.

Results: Direct embedding of Indic languages failed to cluster mood states (Composite = 0.002). All translation-based approaches showed significant improvement. High performance was achieved using Gemini-translated English (Composite = 0.60) and human-translated English (Composite = 0.61) embedded with gemini-001. Surprisingly, human-translated English, further translated into Chinese and embedded with a Chinese model, performed best (Composite = 0.67). Specialised Indic models (IndicBERT and Sarvam-M) performed poorly.

Conclusion: Current models cannot meaningfully represent mood states directly from Indic languages, posing a fundamental barrier to their psychiatric application for diagnostic or therapeutic purposes in India. While high-quality translation bridges this gap, reliance on proprietary models or complex translation pipelines is unsustainable. Models must first be built to understand diverse local languages to be effective in global mental health.
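
The abstract names the four metrics behind the composite score but does not say how they are combined. The sketch below is one plausible reading, not the authors' implementation: it assumes the composite is the unweighted mean of Adjusted Rand Index, Normalised Mutual Information (arithmetic normalisation), Homogeneity and Completeness. The function name `clustering_scores` and the toy labels are hypothetical. It uses only the Python standard library; scikit-learn's `sklearn.metrics` module provides equivalents of the individual metrics.

```python
from collections import Counter
from math import comb, log


def _entropy(counts, n):
    """Shannon entropy (in nats) of a partition, given its group sizes."""
    return -sum((c / n) * log(c / n) for c in counts if c > 0)


def clustering_scores(true_labels, pred_labels):
    """Compare predicted clusters with true mood-state labels.

    Returns a dict with ARI, NMI, homogeneity, completeness and a
    composite. ASSUMPTION: the composite is the unweighted mean of the
    four metrics; the source does not specify the weighting.
    """
    n = len(true_labels)
    if n < 2 or n != len(pred_labels):
        raise ValueError("need two equal-length label lists with >= 2 items")

    joint = Counter(zip(true_labels, pred_labels))  # contingency table
    rows = Counter(true_labels)                     # true class sizes
    cols = Counter(pred_labels)                     # predicted cluster sizes

    # Mutual information between the two partitions (clamped at 0
    # to absorb floating-point noise).
    mi = max(0.0, sum(
        (nij / n) * log(nij * n / (rows[t] * cols[p]))
        for (t, p), nij in joint.items()
    ))
    h_true = _entropy(rows.values(), n)
    h_pred = _entropy(cols.values(), n)

    # Convention used here: a zero-entropy partition scores 1.0.
    homogeneity = mi / h_true if h_true > 0 else 1.0
    completeness = mi / h_pred if h_pred > 0 else 1.0
    nmi = 2 * mi / (h_true + h_pred) if (h_true + h_pred) > 0 else 1.0

    # Adjusted Rand Index computed from pair counts.
    sum_joint = sum(comb(v, 2) for v in joint.values())
    sum_rows = sum(comb(v, 2) for v in rows.values())
    sum_cols = sum(comb(v, 2) for v in cols.values())
    expected = sum_rows * sum_cols / comb(n, 2)
    max_index = (sum_rows + sum_cols) / 2
    ari = ((sum_joint - expected) / (max_index - expected)
           if max_index != expected else 1.0)

    composite = (ari + nmi + homogeneity + completeness) / 4
    return {"ari": ari, "nmi": nmi, "homogeneity": homogeneity,
            "completeness": completeness, "composite": composite}


if __name__ == "__main__":
    # Toy data only: two phrases per mood state, and cluster ids that
    # k-means might assign (cluster numbering is arbitrary).
    moods = ["depression", "depression", "euthymia", "euthymia",
             "euphoric mania", "euphoric mania",
             "dysphoric mania", "dysphoric mania"]
    clusters = [2, 2, 0, 0, 3, 3, 1, 1]
    print(clustering_scores(moods, clusters))
```

Because all four metrics are invariant to how clusters are numbered, the k-means cluster ids never need to be matched to mood-state names before scoring. Under this assumed averaging, a composite near 0 (as reported for direct Indic embeddings) means the clusters carry almost no information about mood state, while a value near 1 means near-perfect recovery.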

Problem

Research questions and friction points this paper is trying to address.

Evaluates language models' ability to understand mood states in non-English languages

Tests if models can cluster mood phrases from 11 Indic languages accurately

Identifies translation as a necessary but unsustainable solution for psychiatric applications

Innovation

Methods, ideas, or system contributions that make the work stand out.

Translating phrases into English or Chinese before embedding and clustering

Comparing multilingual versus language-specific model clustering performance

Employing a composite score (Adjusted Rand Index, Normalised Mutual Information, Homogeneity, Completeness) to evaluate how well k-means clusters recover mood states
