What if? Emulative Simulation with World Models for Situated Reasoning

📅 2026-03-06
🤖 AI Summary
This work addresses the challenge of reasoning about counterfactual spatial questions in real-world settings where active exploration is infeasible due to physical constraints or safety concerns. To this end, the authors propose WanderDream, the first large-scale dataset for mental exploration, which leverages a world model to enable embodied agents to simulate trajectories from their current viewpoint to hypothetical target situations entirely "in the mind," without physical movement. The dataset is constructed by synthesizing panoramic videos and associated spatial question-answer pairs, supporting both trajectory generation and cross-scene transfer evaluation in real environments. Experimental results demonstrate that mental exploration substantially enhances the embodied reasoning capabilities of multimodal large language models in real-world scenarios, validating the effectiveness and generalizability of the proposed approach.

📝 Abstract
Situated reasoning often relies on active exploration, yet in many real-world scenarios such exploration is infeasible due to physical constraints of robots or safety concerns of visually impaired users. Given only a limited observation, can an agent mentally simulate a future trajectory toward a target situation and answer spatial what-if questions? We introduce WanderDream, the first large-scale dataset designed for the emulative simulation of mental exploration, enabling models to reason without active exploration. WanderDream-Gen comprises 15.8K panoramic videos across 1,088 real scenes from HM3D, ScanNet++, and real-world captures, depicting imagined trajectories from current viewpoints to target situations. WanderDream-QA contains 158K question-answer pairs, covering starting states, paths, and end states along each trajectory to comprehensively evaluate exploration-based reasoning. Extensive experiments with world models and MLLMs demonstrate (1) that mental exploration is essential for situated reasoning, (2) that world models achieve compelling performance on WanderDream-Gen, (3) that imagination substantially facilitates reasoning on WanderDream-QA, and (4) that WanderDream data exhibit remarkable transferability to real-world scenarios. The source code and all data will be released.
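
The dataset sizes quoted in the abstract imply some rough averages that help gauge its scale. The sketch below derives them from the rounded counts only; the function and variable names are illustrative, and the abstract does not say that videos or questions are spread evenly across scenes.

```python
# Dataset sizes as stated in the abstract. The figures are rounded
# ("15.8K", "158K"), so the derived ratios are approximate averages.
NUM_VIDEOS = 15_800      # WanderDream-Gen panoramic videos
NUM_SCENES = 1_088       # real scenes (HM3D, ScanNet++, real-world captures)
NUM_QA_PAIRS = 158_000   # WanderDream-QA question-answer pairs


def dataset_ratios(videos: int, scenes: int, qa_pairs: int) -> dict[str, float]:
    """Return the average videos per scene and QA pairs per video."""
    return {
        "videos_per_scene": videos / scenes,
        "qa_pairs_per_video": qa_pairs / videos,
    }


ratios = dataset_ratios(NUM_VIDEOS, NUM_SCENES, NUM_QA_PAIRS)
print(f"videos per scene:   {ratios['videos_per_scene']:.1f}")
print(f"QA pairs per video: {ratios['qa_pairs_per_video']:.1f}")
```

With the rounded figures this works out to about 14.5 videos per scene and 10 question-answer pairs per video on average.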
Problem

Research questions and friction points this paper is trying to address.

situated reasoning
mental simulation
what-if questions
active exploration
world models
Innovation

Methods, ideas, or system contributions that make the work stand out.

emulative simulation
mental exploration
world models
situated reasoning
what-if reasoning