When Does LeJEPA Learn a World Model?

📅 2026-05-25

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work studies when latent variables can be reliably recovered from nonlinear observations, a prerequisite for reliable planning and compositional generalization. The authors prove that LeJEPA, which combines an alignment loss with Gaussian regularization, recovers the world's latent state up to a linear map (linear identifiability) in worlds whose latents evolve under stationary, additive-noise transitions. They show that, among such worlds, the Gaussian is the unique latent distribution for which this guarantee holds, and they prove an approximate identifiability result in which the guarantee degrades gracefully. They also show that linear, orthogonal identifiability enables optimal latent-space planning. Experiments range from 2D examples to 1024-dimensional latents and include distributional ablations and pixel-based robotic control.
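To make the "alignment plus Gaussian regularization" recipe concrete, here is a toy NumPy sketch of an objective with that shape. It is an illustration under stated assumptions, not the loss used by LeJEPA: the page does not specify the regularizer, so a simple mean-and-covariance penalty stands in for it, and alignment is assumed to compare embeddings of consecutive observations. The names `alignment_loss`, `gaussian_moment_penalty`, `toy_objective`, and the weight `lam` are all hypothetical.

```python
import numpy as np

def alignment_loss(z_t, z_next):
    # Mean squared distance between paired embeddings (assumed pairing:
    # embeddings of consecutive observations).
    return float(np.mean(np.sum((z_next - z_t) ** 2, axis=1)))

def gaussian_moment_penalty(z):
    # Stand-in regularizer (an assumption, not LeJEPA's): penalize deviation
    # of the empirical mean from 0 and of the covariance from the identity.
    # This matches only the first two moments of N(0, I).
    mu = z.mean(axis=0)
    cov = np.atleast_2d(np.cov(z, rowvar=False))
    d = z.shape[1]
    return float(mu @ mu + np.sum((cov - np.eye(d)) ** 2))

def toy_objective(z_t, z_next, lam=1.0):
    # Alignment term plus weighted regularizer on all embeddings.
    z_all = np.vstack([z_t, z_next])
    return alignment_loss(z_t, z_next) + lam * gaussian_moment_penalty(z_all)

rng = np.random.default_rng(0)
z_t = rng.standard_normal((4096, 2))                   # toy embeddings at time t
z_next = z_t + 0.1 * rng.standard_normal((4096, 2))    # small additive-noise step

healthy = toy_objective(z_t, z_next)
# A collapsed encoder maps everything to one point: alignment is perfect,
# but the regularizer penalizes the missing variance.
collapsed = toy_objective(np.zeros((4096, 2)), np.zeros((4096, 2)))
print(healthy, collapsed)
```

The collapsed case shows why alignment alone is not enough: mapping every observation to the same point zeroes the alignment term, and only the distributional penalty separates that solution from a useful one.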

📝 Abstract

A representation that scrambles the true degrees of freedom of the world cannot support reliable planning or compositional generalization. We prove that LeJEPA (alignment plus Gaussian regularization) linearly recovers the world's latent variables from nonlinear observations, a property known as linear identifiability, in a broad class of worlds where latents evolve under stationary, additive-noise transitions. Our main result is that among all such worlds, the Gaussian is the unique latent distribution for which this guarantee holds. The forward direction rests on a spectral decomposition in which each degree of nonlinearity is strictly penalized by alignment, making the linear map the optimum; the converse rules out every non-Gaussian alternative. We further prove an approximate identifiability result where the guarantee degrades gracefully, and show that linear, orthogonal identifiability enables optimal latent-space planning. We validate the theory with experiments ranging from 2D examples to 1024-dimensional latents, including distributional ablations and pixel-based robotic control. Our theory turns an empirically successful recipe into a mathematical guarantee, providing the foundation for building World Models that provably recover the structure of the world.
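As a rough illustration of what linear identifiability means in practice, the sketch below fits a least-squares linear probe from a representation to the ground-truth latents and reports R^2 per latent dimension. It is a generic evaluation sketch on synthetic data, not the paper's experimental protocol; the function `linear_probe_r2` and the two toy representations are assumptions introduced here.

```python
import numpy as np

def linear_probe_r2(z_hat, z_true):
    # Fit z_true ~ [z_hat, 1] @ W by least squares; return R^2 per latent dim.
    n = z_hat.shape[0]
    X = np.hstack([z_hat, np.ones((n, 1))])
    W, *_ = np.linalg.lstsq(X, z_true, rcond=None)
    resid = z_true - X @ W
    ss_res = (resid ** 2).sum(axis=0)
    ss_tot = ((z_true - z_true.mean(axis=0)) ** 2).sum(axis=0)
    return 1.0 - ss_res / ss_tot

rng = np.random.default_rng(0)
z = rng.standard_normal((2000, 2))                  # toy ground-truth Gaussian latents

# Representation 1: latents rotated by a random orthogonal matrix
# (linearly identifiable by construction).
q, _ = np.linalg.qr(rng.standard_normal((2, 2)))
z_linear = z @ q

# Representation 2: a nonlinear scramble of the latents that a linear
# probe cannot undo.
z_scrambled = np.stack([z[:, 0] * z[:, 1],
                        z[:, 0] ** 2 - z[:, 1] ** 2], axis=1)

r2_linear = linear_probe_r2(z_linear, z)
r2_scrambled = linear_probe_r2(z_scrambled, z)
print("rotated:", r2_linear)
print("scrambled:", r2_scrambled)
```

A probe R^2 near 1 in every dimension is consistent with the representation being a linear function of the true latents. The scrambled features are uncorrelated with the Gaussian latents, so the probe explains almost none of their variance.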

Problem

Research questions and friction points this paper is trying to address.

world model

linear identifiability

latent variables

nonlinear observations

Gaussian regularization

Innovation

Methods, ideas, or system contributions that make the work stand out.

LeJEPA

linear identifiability

world model