HiVAE: Hierarchical Latent Variables for Scalable Theory of Mind

📅 2026-02-18
🤖 AI Summary
Existing theory of mind (ToM) approaches struggle to scale to complex real-world scenarios and are often confined to small grid-world environments. This work proposes HiVAE, a hierarchical variational autoencoder inspired by the human belief-desire-intention cognitive architecture, which brings hierarchical latent variables to scalable ToM modeling. Its three-level hierarchy significantly outperforms baseline models on a large-scale campus navigation task comprising 3,185 nodes, improving inference of agents' implicit goals and mental states. The authors also identify a critical limitation: the learned latent representations lack explicit grounding in actual mental states. They propose self-supervised alignment strategies to address this and invite community feedback on grounding approaches.

📝 Abstract
Theory of mind (ToM) enables AI systems to infer agents' hidden goals and mental states, but existing approaches focus mainly on small, human-understandable gridworld spaces. We introduce HiVAE, a hierarchical variational architecture that scales ToM reasoning to realistic spatiotemporal domains. Inspired by the belief-desire-intention structure of human cognition, our three-level VAE hierarchy achieves substantial performance improvements on a 3,185-node campus navigation task. However, we identify a critical limitation: while our hierarchical structure improves prediction, learned latent representations lack explicit grounding to actual mental states. We propose self-supervised alignment strategies and present this work to solicit community feedback on grounding approaches.
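
To make the idea of a three-level latent hierarchy concrete, the following is a minimal sketch of a single-sample ELBO estimate for a generic three-level VAE. It is not the authors' implementation: the bottom-up inference chain, the top-down prior, the random affine maps standing in for neural networks, the input and latent sizes, the softmax decoder over nodes, and every function name here are assumptions made for illustration. Only the node count (3,185) comes from the abstract.

```python
import numpy as np

LOG_2PI = np.log(2.0 * np.pi)

def log_normal(z, mu, logvar):
    """Log-density of a diagonal Gaussian, summed over dimensions."""
    return -0.5 * np.sum(LOG_2PI + logvar + (z - mu) ** 2 / np.exp(logvar))

def init_affine(rng, d_in, d_out):
    """Small random affine map (weights, bias); a stand-in for a real network."""
    return rng.normal(scale=0.1, size=(d_out, d_in)), np.zeros(d_out)

def gaussian_params(layer, h):
    """Map h to the mean and log-variance of a diagonal Gaussian."""
    weights, bias = layer
    mu, logvar = np.split(weights @ h + bias, 2)
    return mu, logvar

def reparameterize(mu, logvar, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, I)."""
    return mu + np.exp(0.5 * logvar) * rng.standard_normal(mu.shape)

def init_params(rng, x_dim=16, z_dim=8, n_nodes=3185):
    """Hypothetical sizes; only n_nodes is taken from the abstract."""
    return {
        "enc1": init_affine(rng, x_dim, 2 * z_dim),    # q(z1 | x)
        "enc2": init_affine(rng, z_dim, 2 * z_dim),    # q(z2 | z1)
        "enc3": init_affine(rng, z_dim, 2 * z_dim),    # q(z3 | z2)
        "prior2": init_affine(rng, z_dim, 2 * z_dim),  # p(z2 | z3)
        "prior1": init_affine(rng, z_dim, 2 * z_dim),  # p(z1 | z2)
        "dec": init_affine(rng, z_dim, n_nodes),       # p(node | z1)
    }

def elbo_sample(x, target_node, params, rng):
    """Single-sample ELBO estimate: log p(node | z1) + log p(z) - log q(z | x)."""
    # Bottom-up inference: sample each level from its approximate posterior.
    mu1, lv1 = gaussian_params(params["enc1"], x)
    z1 = reparameterize(mu1, lv1, rng)
    mu2, lv2 = gaussian_params(params["enc2"], z1)
    z2 = reparameterize(mu2, lv2, rng)
    mu3, lv3 = gaussian_params(params["enc3"], z2)
    z3 = reparameterize(mu3, lv3, rng)
    log_q = (log_normal(z1, mu1, lv1)
             + log_normal(z2, mu2, lv2)
             + log_normal(z3, mu3, lv3))

    # Top-down prior: standard normal at the top, conditional Gaussians below.
    log_p = (log_normal(z3, np.zeros_like(z3), np.zeros_like(z3))
             + log_normal(z2, *gaussian_params(params["prior2"], z3))
             + log_normal(z1, *gaussian_params(params["prior1"], z2)))

    # Decoder: categorical log-likelihood of the target node (stable log-softmax).
    weights, bias = params["dec"]
    logits = weights @ z1 + bias
    shifted = logits - logits.max()
    log_lik = shifted[target_node] - np.log(np.sum(np.exp(shifted)))

    return log_lik + log_p - log_q, log_lik

rng = np.random.default_rng(0)
params = init_params(rng)
x = rng.standard_normal(16)  # hypothetical trajectory features
elbo, log_lik = elbo_sample(x, target_node=42, params=params, rng=rng)
```

The sketch uses the sampled log-density form of the ELBO rather than closed-form KL terms, because with a bottom-up inference chain each higher-level sample depends on the level below it. Whether HiVAE factorizes its posterior this way is not stated in the abstract.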
Problem

Research questions and friction points this paper is trying to address.

Theory of Mind
Scalability
Latent Representation
Mental States
Grounding
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical VAE
Theory of Mind
Latent Representation
Self-supervised Alignment
Scalable Reasoning
Nigel Doering
University of California San Diego, School of Computing, Information, and Data Sciences
Rahath Malladi
University of California San Diego, School of Computing, Information, and Data Sciences
Arshia Sangwan
New York University
David Danks
Professor of Data Science, Philosophy, & Policy; UC San Diego
Philosophy of science, Cognitive science, Ethics & AI, Causal discovery
Tauhidur Rahman
Halıcıoğlu Data Science Institute, University of California San Diego
Mobile and Ubiquitous Computing, Signal Processing, Machine Learning, Health and Behavior Modeling