Decompose-ToM: Enhancing Theory of Mind Reasoning in Large Language Models through Simulation and Task Decomposition

📅 2025-01-15

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Large language models (LLMs) exhibit limited performance on higher-order Theory of Mind (ToM) tasks—particularly those requiring multi-step logical reasoning and recursive perspective-taking. To address this, we propose a fine-tuning-free prompting framework inspired by cognitive psychology’s Simulation Theory. Our method integrates recursive perspective simulation with structured task decomposition, including agent identification, question reformulation, dynamic world-model updating, and knowledge-availability assessment. Crucially, we introduce, for the first time in LLM-based ToM reasoning, a “pretend play” mechanism to enhance mental simulation fidelity. The approach is modular, lightweight, and requires only minimal prompt adaptation. Extensive experiments across higher-order and dialogue-based ToM benchmarks demonstrate an average accuracy improvement of 28.7% across multiple LLMs, with strong generalization and consistent superiority over state-of-the-art zero-shot and few-shot baselines.

📝 Abstract

Theory of Mind (ToM) is the ability to understand and reflect on the mental states of others. Although this capability is crucial for human interaction, testing on Large Language Models (LLMs) reveals that they possess only a rudimentary understanding of it. The most capable closed-source LLMs have come close to human performance on some ToM tasks, but they still perform poorly on complex variations of the task that involve more structured reasoning. In this work, we use the concept of "pretend-play", or "Simulation Theory" from cognitive psychology, to propose "Decompose-ToM": an LLM-based inference algorithm that improves model performance on complex ToM tasks. We recursively simulate user perspectives and decompose the ToM task into a simpler set of functions: subject identification, question reframing, world-model updating, and knowledge availability. We test the algorithm on higher-order ToM tasks and on a task testing for ToM capabilities in a conversational setting, demonstrating that our approach shows significant improvement across models compared to baseline methods while requiring minimal prompt tuning across tasks and no additional model training.
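
The recursive decomposition named in the abstract can be illustrated with a toy, rule-based sketch. This is not the authors' implementation: in Decompose-ToM each step is carried out by prompting an LLM, whereas the version below replaces every step with a deterministic stand-in so the control flow is easy to follow. The `Event` class, the function names, and the pre-parsed question format (a list of agents, outermost first, plus an object name) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Tuple


@dataclass(frozen=True)
class Event:
    """Hypothetical story step: an object is placed somewhere while some agents watch."""
    obj: str
    location: str
    witnesses: FrozenSet[str]


def identify_subject(agents: List[str]) -> Tuple[Optional[str], List[str]]:
    """Stand-in for subject identification and question reframing.

    Takes the chain of agents in a nested belief question (outermost first),
    returns the outermost agent and the reframed, one-level-shallower chain.
    """
    if not agents:
        return None, []
    return agents[0], agents[1:]


def visible_to(agent: str, story: List[Event]) -> List[Event]:
    """Stand-in for knowledge availability and world-model updating.

    Keeps only the events the agent witnessed, i.e. the world as they know it.
    """
    return [event for event in story if agent in event.witnesses]


def answer(agents: List[str], obj: str, story: List[Event]) -> Optional[str]:
    """Answer 'where does A1 think A2 thinks ... the object is?' by recursion."""
    subject, remaining = identify_subject(agents)
    if subject is None:
        # Base case: a plain factual question over the (possibly filtered) story.
        for event in reversed(story):
            if event.obj == obj:
                return event.location
        return None
    # Recursive case: simulate the subject's perspective, then ask the simpler question.
    return answer(remaining, obj, visible_to(subject, story))


# Sally-Anne style story: Sally leaves before the ball is moved.
story = [
    Event("ball", "basket", frozenset({"Sally", "Anne"})),
    Event("ball", "box", frozenset({"Anne"})),
]

print(answer([], "ball", story))                 # box (ground truth)
print(answer(["Sally"], "ball", story))          # basket (first-order belief)
print(answer(["Anne", "Sally"], "ball", story))  # basket (second-order belief)
```

Each recursive call strips one level of nesting from the question and shrinks the story to what the simulated agent could know, so a higher-order question reduces to a sequence of first-order filtering steps followed by a factual lookup.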

Problem

Research questions and friction points this paper is trying to address.

Large Language Models

Theory of Mind

Complex Logical Reasoning

Innovation

Methods, ideas, or system contributions that make the work stand out.

Decomposed ToM

Large Language Models

Complex Scenario Understanding

🔎 Similar Papers

Benchmarking Mental State Representations in Language Models