🤖 AI Summary
How can we quantitatively model the unobservable beliefs that underlie others' behavior? This work introduces the Language-augmented Bayesian Theory of Mind (LaBToM), a framework that tightly integrates large language model (LLM)-driven translation of natural language into structured mental representations with Bayesian inverse inference. LaBToM uses grammar-constrained LLM decoding to map natural language (including modal verbs, uncertainty expressions, and false-belief statements) into an epistemic language of thought, then evaluates these representations against inferences obtained by inverting a generative model of rational action, yielding graded, probabilistic plausibility judgments of epistemic claims. On a maze-navigation belief-assessment task, LaBToM correlates significantly more strongly with human judgments than multimodal foundation models (e.g., GPT-4o, Gemini Pro) and all ablated baselines, robustly handling five canonical classes of epistemic expressions: modal language, uncertainty expressions, knowledge claims, likelihood comparisons, and attributions of false belief. The framework offers an interpretable, empirically testable paradigm for jointly modeling language and cognition in computational theory of mind.
📝 Abstract
How do people understand and evaluate claims about others' beliefs, even though these beliefs cannot be directly observed? In this paper, we introduce a cognitive model of epistemic language interpretation, grounded in Bayesian inferences about other agents' goals, beliefs, and intentions: a language-augmented Bayesian theory-of-mind (LaBToM). By translating natural language into an epistemic "language-of-thought" with grammar-constrained LLM decoding, then evaluating these translations against the inferences produced by inverting a generative model of rational action and perception, LaBToM captures graded plausibility judgments of epistemic claims. We validate our model in an experiment where participants watch an agent navigate a maze to find keys hidden in boxes needed to reach their goal, then rate sentences about the agent's beliefs. In contrast with multimodal LLMs (GPT-4o, Gemini Pro) and ablated models, our model correlates highly with human judgments for a wide range of expressions, including modal language, uncertainty expressions, knowledge claims, likelihood comparisons, and attributions of false belief.
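To make the pipeline concrete, here is a minimal illustrative sketch (not the authors' code) of the evaluation step: a posterior over an agent's belief states is inferred by inverting a rational action model, an epistemic sentence is translated into a predicate over belief states, and the claim's graded plausibility is its expected truth under that posterior. The toy domain (a key hidden in one of three boxes) and all names are hypothetical.

```python
# Hypothetical posterior over the agent's belief states, as would be
# produced by Bayesian inverse planning from the agent's observed actions.
# Each entry pairs a posterior weight with a belief state, i.e. the agent's
# subjective distribution over where the key is.
posterior = [
    # (P(belief state | observed actions), agent's P(key in box i))
    (0.6, {"box1": 0.8, "box2": 0.1, "box3": 0.1}),
    (0.3, {"box1": 0.2, "box2": 0.7, "box3": 0.1}),
    (0.1, {"box1": 0.1, "box2": 0.1, "box3": 0.8}),
]

def claim_plausibility(posterior, predicate):
    """Graded plausibility of an epistemic claim: the expected truth value
    of its translated predicate under the posterior over belief states."""
    return sum(weight for weight, belief in posterior if predicate(belief))

# A grammar-constrained translation might map the sentence
# "The agent thinks the key is probably in box 1" to a predicate like:
probably_box1 = lambda belief: belief["box1"] > 0.5

print(claim_plausibility(posterior, probably_box1))  # 0.6
```

Only the first belief state satisfies the predicate, so the claim inherits that state's posterior weight (0.6); a claim like "the agent knows the key is in box 3" would score 0.0 under a stricter predicate, giving the graded judgments the paper compares against human ratings.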