From Kepler to Newton: Inductive Biases Guide Learned World Models in Transformers

📅 2026-02-06
📈 Citations: 1
Influential: 0
📄 PDF
🤖 AI Summary
This work proposes a Transformer-based world model that, despite lacking explicit physical priors, discovers fundamental physical laws from observational data alone. By incorporating only three minimal inductive biases (spatial smoothness, training stability, and temporal locality), the model learns Keplerian elliptical orbits from planetary trajectory data and further develops force representations consistent with Newtonian gravitation. The authors present this as the first end-to-end discovery of physical laws directly from raw observations, demonstrating that careful architectural design can endow general-purpose AI models with genuine physical insight. The approach goes beyond conventional predictive modeling by uncovering underlying causal structure rather than merely fitting correlations in the data, highlighting the critical role of model architecture in enabling scientific discovery.

📝 Abstract
Can general-purpose AI architectures go beyond prediction to discover the physical laws governing the universe? True intelligence relies on "world models" -- causal abstractions that allow an agent to not only predict future states but understand the underlying governing dynamics. While previous "AI Physicist" approaches have successfully recovered such laws, they typically rely on strong, domain-specific priors that effectively "bake in" the physics. Conversely, Vafa et al. recently showed that generic Transformers fail to acquire these world models, achieving high predictive accuracy without capturing the underlying physical laws. We bridge this gap by systematically introducing three minimal inductive biases. We show that ensuring spatial smoothness (by formulating prediction as continuous regression) and stability (by training with noisy contexts to mitigate error accumulation) enables generic Transformers to surpass prior failures and learn a coherent Keplerian world model, successfully fitting ellipses to planetary trajectories. However, true physical insight requires a third bias: temporal locality. By restricting the attention window to the immediate past -- imposing the simple assumption that future states depend only on the local state rather than a complex history -- we force the model to abandon curve-fitting and discover Newtonian force representations. Our results demonstrate that simple architectural choices determine whether an AI becomes a curve-fitter or a physicist, marking a critical step toward automated scientific discovery.
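The temporal-locality bias described above can be illustrated with a local (banded) causal attention mask. This is a minimal sketch, not the paper's actual implementation; the window size `window` is a hypothetical parameter standing in for whatever attention restriction the authors use:

```python
import numpy as np

def local_causal_mask(seq_len: int, window: int) -> np.ndarray:
    """Boolean mask: entry (i, j) is True iff query step i may attend to key step j.

    Causality forbids attending to the future (j <= i); temporal locality
    additionally hides everything older than `window` steps (j > i - window),
    so the model cannot lean on a long history for curve-fitting.
    """
    i = np.arange(seq_len)[:, None]  # query positions, column vector
    j = np.arange(seq_len)[None, :]  # key positions, row vector
    return (j <= i) & (j > i - window)

mask = local_causal_mask(seq_len=5, window=2)
# Row 4 sees only steps 3 and 4: the immediate past, nothing earlier.
```

In a standard Transformer this mask would be applied by setting disallowed attention logits to a large negative value before the softmax; with `window=1` each prediction depends only on the current state, the strictest form of the locality assumption.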
Problem

Research questions and friction points this paper is trying to address.

world models
physical laws
inductive biases
scientific discovery
Transformers
Innovation

Methods, ideas, or system contributions that make the work stand out.

inductive biases
world models
Transformers
temporal locality
scientific discovery