🤖 AI Summary
This work investigates the conditions under which world models can effectively learn physical dynamical systems from "tokenizations" of historical frames, that is, low-dimensional projections of individual frames concatenated into a history. We establish a theoretical framework characterizing the necessary and sufficient conditions for a dynamical system to be accurately modeled via tokenized history sequences. The framework clarifies why compact temporal representations suffice for reconstructing future states, and so connects representation learning with dynamics modeling. Methodologically, we adopt a progressive modeling paradigm, moving from least-squares regression through shallow adversarial networks to full GANs, and validate the approach on canonical PDEs (modified forms of the heat and wave equations, and the 2D Kuramoto–Sivashinsky equation in its chaotic regime) and a CFD dataset of a 2D Kármán vortex street. Results indicate that high-fidelity reconstruction of complex nonlinear evolution is achievable using only compact latent sequences, which may improve the interpretability and out-of-distribution generalization of world models for physical systems.
📝 Abstract
In this work, we explore the use of compact latent representations with learned time dynamics ('World Models') to simulate physical systems. Drawing on concepts from control theory, we propose a theoretical framework that explains why projecting time slices into a low-dimensional space and then concatenating them to form a history ('Tokenization') is so effective for learning from physics datasets, and we characterise exactly when the underlying dynamics admit a reconstruction mapping from the history of previous tokenized frames to the next. To validate these claims, we develop a sequence of models of increasing complexity, starting with least-squares regression and progressing through simple linear layers, shallow adversarial learners, and ultimately full-scale generative adversarial networks (GANs). We evaluate these models on a variety of datasets, including modified forms of the heat and wave equations, the 2D Kuramoto-Sivashinsky equation in its chaotic regime, and a challenging computational fluid dynamics (CFD) dataset of a 2D Kármán vortex street around a fixed cylinder, where our model successfully recreates the flow.
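The tokenization idea and the simplest model in the sequence (least-squares regression) can be illustrated with a small sketch. This is not the authors' implementation: the projection used here (a truncated SVD of the snapshot matrix), the standing-wave test problem, and all function names (`simulate_wave`, `tokenize`, `make_history`, `fit_and_score`) are assumptions chosen for illustration. It projects each frame of a 1D wave-equation solution onto three latent coordinates, concatenates the last `h` tokens into a history, and fits a linear map to the next token. Because the wave equation is second order in time, a single token does not determine the next one, whereas a history of two does.

```python
import numpy as np

def simulate_wave(n_steps=400, n_grid=64, dt=0.2, c=1.0):
    """Exact standing-wave solution of u_tt = c^2 u_xx on a periodic grid.

    Returns an array of shape (n_steps, n_grid), one row per time slice.
    The modes, amplitudes and phases are arbitrary illustrative choices.
    """
    x = np.linspace(0.0, 2.0 * np.pi, n_grid, endpoint=False)
    t = dt * np.arange(n_steps)
    modes = np.array([1, 2, 3])
    amps = np.array([1.0, 0.5, 0.25])
    phases = np.array([0.0, 0.7, 1.3])
    # Temporal coefficients, shape (n_steps, n_modes).
    q = amps * np.cos(c * modes * t[:, None] + phases)
    # Spatial mode shapes, shape (n_modes, n_grid).
    shapes = np.sin(modes[:, None] * x)
    return q @ shapes

def tokenize(frames, k):
    """Project each frame onto the top-k right singular vectors (assumed encoder)."""
    _, _, vt = np.linalg.svd(frames, full_matrices=False)
    basis = vt[:k]                      # shape (k, n_grid), orthonormal rows
    return frames @ basis.T, basis      # tokens have shape (n_steps, k)

def make_history(tokens, h):
    """Concatenate h consecutive tokens as input; the following token is the target."""
    n = tokens.shape[0] - h
    inputs = np.hstack([tokens[i:i + n] for i in range(h)])
    targets = tokens[h:]
    return inputs, targets

def fit_and_score(tokens, h):
    """Least-squares map from history to next token; returns in-sample relative error."""
    inputs, targets = make_history(tokens, h)
    weights, *_ = np.linalg.lstsq(inputs, targets, rcond=None)
    residual = inputs @ weights - targets
    return np.linalg.norm(residual) / np.linalg.norm(targets)

frames = simulate_wave()
tokens, basis = tokenize(frames, k=3)
err_h1 = fit_and_score(tokens, h=1)
err_h2 = fit_and_score(tokens, h=2)
print(f"relative one-step error, history 1: {err_h1:.3e}")
print(f"relative one-step error, history 2: {err_h2:.3e}")
```

In this toy setting the history-of-one fit should leave a clearly visible residual, while the history-of-two fit should be accurate to roughly floating-point precision, since each latent coordinate obeys a two-step linear recurrence. The errors reported are in-sample; a held-out split would be needed to say anything about generalization.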