Flow Equivariant World Models: Memory for Partially Observed Dynamic Environments

📅 2026-01-03
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
🤖 AI Summary
Existing world models struggle to maintain stable, data-efficient representations over long horizons in partially observed dynamic environments, particularly when the agent’s self-motion is entangled with the motion of external objects. This work treats both kinds of motion as one-parameter Lie group “flows” and builds a world model that is equivariant to them. On 2D and 3D partially observed video benchmarks, the model significantly outperforms comparable diffusion-based and memory-augmented world models. The gains are largest for long rollouts, which generalize far beyond the training horizon, and in scenes where predictable dynamics continue outside the agent’s current field of view.

📝 Abstract
Embodied systems experience the world as 'a symphony of flows': a combination of many continuous streams of sensory input coupled to self-motion, interwoven with the dynamics of external objects. These streams obey smooth, time-parameterized symmetries, which combine through a precisely structured algebra; yet most neural network world models ignore this structure and instead repeatedly re-learn the same transformations from data. In this work, we introduce 'Flow Equivariant World Models', a framework in which both self-motion and external object motion are unified as one-parameter Lie group 'flows'. We leverage this unification to implement group equivariance with respect to these transformations, thereby providing a stable latent world representation over hundreds of timesteps. On both 2D and 3D partially observed video world modeling benchmarks, we demonstrate that Flow Equivariant World Models significantly outperform comparable state-of-the-art diffusion-based and memory-augmented world modeling architectures -- particularly when there are predictable world dynamics outside the agent's current field of view. We show that flow equivariance is particularly beneficial for long rollouts, generalizing far beyond the training horizon. By structuring world model representations with respect to internal and external motion, flow equivariance charts a scalable route to data-efficient, symmetry-guided, embodied intelligence. Project link: https://flowequivariantworldmodels.github.io.
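To make the idea of a memory that moves with a flow concrete, here is a minimal toy sketch. It is not the authors' architecture. It assumes a circular 1D grid, integer translation as the one-parameter flow, and a fixed field of view. The names `flow`, `memory_step`, and `rollout` are hypothetical, chosen only for this illustration. The memory is transported along the flow at each step and then overwritten wherever the agent can currently see.

```python
import numpy as np

def flow(x, v, t=1):
    """Apply the translation flow psi_t: shift a circular 1D signal by v * t cells."""
    return np.roll(x, v * t)

def memory_step(h, obs, visible, memory_v):
    """Hypothetical update: transport memory along the flow, then overwrite what is in view."""
    transported = flow(h, memory_v)
    return np.where(visible, obs, transported)

def rollout(world0, visible, world_v, memory_v, steps):
    """Feed `steps` partially observed frames of a world that moves at velocity world_v."""
    h = np.zeros_like(world0)
    for t in range(steps):
        world_t = flow(world0, world_v, t)    # true world state at time t
        obs = np.where(visible, world_t, 0)   # the agent only sees inside its field of view
        h = memory_step(h, obs, visible, memory_v)
    return h

# Toy scene (assumed): one object at cell 2, field of view covers cells 0..3.
world0 = np.zeros(16, dtype=int)
world0[2] = 1
visible = np.arange(16) < 4

# Memory that moves with the flow keeps tracking the object after it leaves view.
tracked = rollout(world0, visible, world_v=1, memory_v=1, steps=6)
# Memory that ignores the flow loses the object once it leaves view.
static = rollout(world0, visible, world_v=1, memory_v=0, steps=6)
```

In this toy setting, `tracked` equals the true world state at the final step, even though the object left the field of view after the second frame, while `static` has forgotten it. The abstract's claim concerns learned latent representations under richer flows. The sketch only shows why transporting memory along a known flow helps when predictable dynamics continue out of view.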
Problem

Research questions and friction points this paper is trying to address.

world models
partial observability
equivariance
dynamic environments
embodied intelligence
Innovation

Methods, ideas, or system contributions that make the work stand out.

flow equivariance
Lie group flows
world models
partial observability
embodied intelligence
Hansen Lillemark
Kempner Institute, Harvard University; CSE, UC San Diego
Benhao Huang
ML, Carnegie Mellon University
Fangneng Zhan
MIT
Neural Rendering, Generative Models
Yilun Du
Harvard University
Artificial Intelligence, Machine Learning, Robotics, Computer Vision
T. A. Keller
Kempner Institute, Harvard University