🤖 AI Summary
This study investigates whether large language models (LLMs) possess human-like cross-linguistic psycholinguistic representations. Method: We employed monolingual and bilingual language-identity prompts (English, Dutch, Chinese) alongside behavioral sound-symbolism and word-valence tasks, complemented by representational probing of internal states in Llama-3.3-70B-Instruct and Qwen2.5-72B-Instruct. Contribution/Results: Language identity significantly modulates model responses: Chinese prompts elicit stronger and more stable affective representations, and Qwen demonstrates greater language sensitivity. Language-specific signals are more decodable in deeper transformer layers. This work provides the first systematic empirical evidence that LLMs dynamically adapt their psycholinguistic responses to language identity, revealing an intrinsic, context-sensitive representational mechanism. These findings establish a foundation for cross-linguistic cognitive modeling grounded in LLMs, advancing our understanding of how linguistic context shapes semantic and affective processing in artificial systems.
📝 Abstract
Large Language Models (LLMs) exhibit strong linguistic capabilities, but little is known about how they encode psycholinguistic knowledge across languages. We investigate whether and how LLMs exhibit human-like psycholinguistic responses under different linguistic identities using two tasks: sound symbolism and word valence. We evaluate two models, Llama-3.3-70B-Instruct and Qwen2.5-72B-Instruct, under monolingual and bilingual prompting in English, Dutch, and Chinese. Behaviorally, both models adjust their outputs based on prompted language identity, with Qwen showing greater sensitivity and sharper distinctions between Dutch and Chinese. Probing analysis reveals that psycholinguistic signals become more decodable in deeper layers, with Chinese prompts yielding stronger and more stable valence representations than Dutch. Our results demonstrate that language identity conditions both output behavior and internal representations in LLMs, providing new insights into their application as models of cross-linguistic cognition.
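The layer-wise probing analysis described above can be sketched as follows: train a simple linear probe on each layer's hidden states to decode the prompted language identity, and compare cross-validated accuracy (decodability) across depth. This is a minimal illustration with synthetic hidden states standing in for the models' activations; the array shapes, label encoding, and the planted depth-dependent signal are all assumptions for demonstration, not the study's actual data or probe configuration.

```python
# Hypothetical sketch of layer-wise probing for language identity.
# Synthetic hidden states stand in for activations from
# Llama-3.3-70B-Instruct or Qwen2.5-72B-Instruct.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_prompts, n_layers, d_model = 120, 8, 64

# Assumed label encoding: 0 = Dutch prompt, 1 = Chinese prompt.
labels = rng.integers(0, 2, size=n_prompts)

# Synthetic hidden states with a language-identity signal that grows
# with depth, mimicking the finding that deeper layers are more decodable.
hidden = rng.normal(size=(n_layers, n_prompts, d_model))
direction = rng.normal(size=d_model)
for layer in range(n_layers):
    strength = layer / (n_layers - 1)  # 0 at the bottom layer, 1 at the top
    hidden[layer] += strength * np.outer(2 * labels - 1, direction)

# Probe each layer: mean 5-fold cross-validated accuracy = decodability.
accuracy = [
    cross_val_score(
        LogisticRegression(max_iter=1000), hidden[layer], labels, cv=5
    ).mean()
    for layer in range(n_layers)
]
```

Plotting `accuracy` against layer index would show the rising decodability curve; in a real analysis the synthetic `hidden` array would be replaced by hidden states extracted from the model under each language-identity prompt.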