ToM-RL: Reinforcement Learning Unlocks Theory of Mind in Small LLMs

📅 2025-04-02
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This study addresses the limited Theory of Mind (ToM) capability in small- to medium-scale LLMs (0.5B–7B parameters) for social reasoning. We propose a rule-based post-training reinforcement learning framework integrated with belief tracking and a lightweight, structured ToM dataset (3,200 samples), enabling stable activation of high-order ToM reasoning in models of this scale for the first time. Our method significantly improves out-of-distribution generalization and cross-format robustness: on the Hi-ToM benchmark, the 7B model achieves 84.50% accuracy—surpassing both GPT-4o and DeepSeek-v3, and outperforming state-of-the-art models of comparable size. The core contribution lies in demonstrating that RL can effectively endow small-scale models with robust ToM capabilities, overcoming structural bottlenecks in complex social reasoning. This advances trustworthy social AI for resource-constrained settings, establishing a novel paradigm grounded in efficient, interpretable, and scalable ToM enhancement.

📝 Abstract
Recent advancements in rule-based reinforcement learning (RL), applied during the post-training phase of large language models (LLMs), have significantly enhanced their capabilities in structured reasoning tasks such as mathematics and logical inference. However, the effectiveness of RL in social reasoning, particularly in Theory of Mind (ToM), the ability to infer others' mental states, remains largely unexplored. In this study, we demonstrate that RL methods effectively unlock ToM reasoning capabilities even in small-scale LLMs (0.5B to 7B parameters). Using a modest dataset comprising 3200 questions across diverse scenarios, our RL-trained 7B model achieves 84.50% accuracy on the Hi-ToM benchmark, surpassing models like GPT-4o and DeepSeek-v3 despite significantly fewer parameters. While smaller models (≤3B parameters) suffer from reasoning collapse, larger models (7B parameters) maintain stable performance through consistent belief tracking. Additionally, our RL-based models demonstrate robust generalization to higher-order, out-of-distribution ToM problems, novel textual presentations, and previously unseen datasets. These findings highlight RL's potential to enhance social cognitive reasoning, bridging the gap between structured problem-solving and nuanced social inference in LLMs.
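The "rule-based" RL the abstract refers to typically means rewards computed by deterministic checks on the model's output rather than by a learned reward model. The paper does not specify its exact reward design here, so the following is only a minimal illustrative sketch: it assumes hypothetical `<think>`/`<answer>` output tags and hypothetical reward weights (0.1 for format, 1.0 for a correct answer), in the style of common rule-based post-training setups.

```python
import re

def rule_based_reward(response: str, gold_answer: str) -> float:
    """Illustrative rule-based reward for a ToM question.

    Two deterministic rules (weights are assumptions, not from the paper):
      +0.1 if the response is well-formed (<think> and <answer> tags present)
      +1.0 if the extracted answer matches the gold label (case-insensitive)
    """
    reward = 0.0
    # Format rule: reasoning and final answer must both be tagged.
    has_think = re.search(r"<think>.*?</think>", response, re.DOTALL)
    has_answer = re.search(r"<answer>.*?</answer>", response, re.DOTALL)
    if has_think and has_answer:
        reward += 0.1
    # Accuracy rule: compare the extracted answer to the gold label.
    match = re.search(r"<answer>(.*?)</answer>", response, re.DOTALL)
    if match and match.group(1).strip().lower() == gold_answer.strip().lower():
        reward += 1.0
    return reward

# Example: a first-order false-belief question ("Where will Sally look?")
good = "<think>Sally left before the marble moved.</think><answer>basket</answer>"
bad = "<answer>box</answer>"  # wrong answer, no reasoning tag
```

Because the reward is a fixed program, it gives a stable, interpretable training signal and cannot be gamed the way a learned reward model can, which is one reason this style of RL scales down to small models.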
Problem

Research questions and friction points this paper is trying to address.

Enhancing Theory of Mind in small LLMs using RL
Exploring RL's effectiveness in social reasoning tasks
Bridging structured reasoning and social inference gaps
Innovation

Methods, ideas, or system contributions that make the work stand out.

RL enhances small LLMs' Theory of Mind
Modest dataset achieves high ToM accuracy
RL enables robust generalization in ToM
Yi-Long Lu
Peking University
decision making, problem solving, computational modeling
Chunhui Zhang
State Key Laboratory of General Artificial Intelligence, BIGAI, Beijing, China
Jiajun Song
Michigan Technological University
Wave Energy Converter
Lifeng Fan
University of California, Los Angeles
Artificial Intelligence, Cognitive Modeling, Social Interaction
Wei Wang
State Key Laboratory of General Artificial Intelligence, BIGAI, Beijing, China