From Stimuli to Minds: Enhancing Psychological Reasoning in LLMs via Bilateral Reinforcement Learning

📅 2025-08-04
📈 Citations: 0
Influential: 0
🤖 AI Summary
Large language models (LLMs) perform poorly at inferring implicit mental states such as emotions, intentions, and beliefs, primarily because they lack theory-aligned supervision and do not adequately model fine-grained psychological processes in realistic narratives. To address this, we propose a trajectory-aware bilateral reinforcement learning framework that integrates psychology-theory-driven supervision signals with dynamic reasoning-path modeling over psychologically rich, real-world scenarios. Our method leverages expert-annotated data, imitation of human psychological reasoning trajectories, and knowledge internalization via smaller surrogate models to guide LLMs toward expert-level social-cognitive reasoning patterns. Experiments on multiple benchmarks show that our approach achieves human-expert-level interpretability while significantly improving out-of-distribution generalization and continual learning across diverse psychological reasoning tasks.
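
The summary describes the method only at a high level, so the following is a minimal, hypothetical sketch of the kind of objective a trajectory-aware reinforcement learning setup might combine: an imitation term over expert-annotated reasoning trajectories plus a reward-weighted policy-gradient term. The function name, tensor shapes, mixing weight `lam`, and REINFORCE-style formulation are all assumptions for illustration, not the paper's released implementation.

```python
# Illustrative sketch only: a combined imitation + reinforcement objective over
# reasoning trajectories. Names, shapes, and the mixing weight `lam` are
# assumptions for exposition, not the paper's implementation.
import torch
import torch.nn.functional as F

def trajectory_objective(policy_logits, expert_tokens, sampled_tokens, rewards, lam=0.5):
    """policy_logits: (batch, seq, vocab) logits from the LLM policy.
    expert_tokens:   (batch, seq) ids of expert-annotated reasoning trajectories.
    sampled_tokens:  (batch, seq) ids of trajectories sampled from the policy.
    rewards:         (batch,) scalar theory-aligned rewards for the samples."""
    vocab = policy_logits.size(-1)
    log_probs = F.log_softmax(policy_logits, dim=-1)

    # Imitation term: negative log-likelihood of the expert trajectory,
    # pushing the policy toward expert psychological reasoning patterns.
    imitation = F.nll_loss(
        log_probs.reshape(-1, vocab), expert_tokens.reshape(-1), reduction="mean"
    )

    # Reinforcement term: REINFORCE-style reward-weighted log-likelihood of the
    # sampled trajectories, with a mean-reward baseline for variance reduction.
    sampled_logp = log_probs.gather(-1, sampled_tokens.unsqueeze(-1)).squeeze(-1)
    advantage = (rewards - rewards.mean()).unsqueeze(-1)
    reinforce = -(advantage * sampled_logp).mean()

    return lam * imitation + (1.0 - lam) * reinforce
```

In practice the expert and sampled trajectories would each need their own forward pass, and the reward would come from theory-aligned checks on the sampled reasoning (e.g., consistency with annotated emotions or beliefs); both details are elided in this sketch.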

📝 Abstract
Large Language Models show promise in emotion understanding, social reasoning, and empathy, yet they struggle with psychologically grounded tasks that require inferring implicit mental states in context-rich, ambiguous settings. These limitations arise from the absence of theory-aligned supervision and the difficulty of capturing nuanced mental processes in real-world narratives. To address this gap, we leverage expert-labeled, psychologically rich scenarios and propose a trajectory-aware reinforcement learning framework that explicitly imitates expert psychological thought patterns. By integrating real-world stimuli with structured reasoning guidance, our approach enables compact models to internalize social-cognitive principles, perform nuanced psychological inference, and support continual self-improvement. Comprehensive experiments across multiple benchmarks further demonstrate that our models achieve expert-level interpretive capabilities, exhibiting strong out-of-distribution generalization and robust continual learning across diverse, challenging, and psychologically grounded tasks.
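
The abstract's claim that compact models can internalize social-cognitive principles implies some transfer of expert-guided reasoning behavior into smaller models. As a hedged illustration only (the paper's actual transfer mechanism is not given here), the sketch below shows a standard temperature-scaled distillation loss over trajectory token distributions; the temperature `T` and KL formulation are assumed choices.

```python
# Illustrative sketch only: distilling trajectory-level reasoning behavior from a
# larger teacher into a compact student via a temperature-scaled KL divergence.
# The temperature `T` and batchmean reduction are assumed, not taken from the paper.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """student_logits, teacher_logits: (batch, seq, vocab) over the same trajectory."""
    vocab = student_logits.size(-1)
    student_logp = F.log_softmax(student_logits.reshape(-1, vocab) / T, dim=-1)
    teacher_prob = F.softmax(teacher_logits.reshape(-1, vocab) / T, dim=-1)
    # KL(teacher || student), scaled by T^2 as is conventional for distillation.
    return F.kl_div(student_logp, teacher_prob, reduction="batchmean") * (T * T)
```
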
Problem

Research questions and friction points this paper is trying to address.

Enhancing psychological reasoning in LLMs for ambiguous contexts
Addressing the lack of theory-aligned supervision in mental-state inference
Improving nuanced psychological inference in real-world narratives
Innovation

Methods, ideas, or system contributions that make the work stand out.

Expert-labeled, psychologically rich scenarios
Trajectory-aware reinforcement learning framework
Integration of real-world stimuli with structured reasoning guidance