🤖 AI Summary
Problem: Exact group-equivariance constraints in models of physical dynamics can degrade performance when the underlying symmetries are broken, as is common in real-world asymmetric dynamics.
Method: This paper proposes object-centric world models for multi-object scenes, built with geometric algebra (GA) neural networks. Instead of hard equivariance constraints, the models rely on a soft geometric inductive bias encoded via GA, which preserves structural priors while accommodating physical asymmetry. The framework combines geometric algebra neural networks, object-centric representations, and autoregressive temporal modeling.
Results: The models are evaluated on simulated 2D rigid-body dynamics with static obstacles and are trained autoregressively for next-step prediction. On long-horizon rollouts, the soft inductive bias yields better physical fidelity than non-equivariant baseline models. The results suggest that geometric algebra offers an effective middle ground between hand-crafted physics and unstructured deep networks, delivering sample-efficient dynamics models for multi-object scenes.
📝 Abstract
Equivariance is a powerful prior for learning physical dynamics, yet exact group equivariance can degrade performance if the symmetries are broken. We propose object-centric world models built with geometric algebra neural networks, providing a soft geometric inductive bias. Our models are evaluated using simulated environments of 2D rigid-body dynamics with static obstacles, where we train for next-step predictions autoregressively. For long-horizon rollouts, we show that the soft inductive bias of our models results in better physical fidelity than non-equivariant baseline models. The approach complements recent soft-equivariance ideas and aligns with the view that simple, well-chosen priors can yield robust generalization. These results suggest that geometric algebra offers an effective middle ground between hand-crafted physics and unstructured deep nets, delivering sample-efficient dynamics models for multi-object scenes.
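To make the geometric prior concrete, here is a minimal sketch of the 2D geometric algebra G(2,0) in plain Python. It is illustrative only and is not the paper's implementation: the tuple layout `(scalar, e1, e2, e12)` and the helper names `geometric_product`, `reverse`, `rotor`, `rotate`, and `close` are assumptions made for this example. It shows that the geometric product commutes with rotations applied through a rotor, which is the kind of structure a geometric algebra layer can build on.

```python
import math

# Multivectors of the 2D geometric algebra G(2,0), stored as tuples
# (scalar, e1, e2, e12), where e12 = e1*e2 is the unit bivector.
# This layout is an assumption made for this sketch.

def geometric_product(a, b):
    """Geometric product, using e1*e1 = e2*e2 = 1 and e12*e12 = -1."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (
        a0 * b0 + a1 * b1 + a2 * b2 - a3 * b3,  # scalar part
        a0 * b1 + a1 * b0 - a2 * b3 + a3 * b2,  # e1 part
        a0 * b2 + a2 * b0 + a1 * b3 - a3 * b1,  # e2 part
        a0 * b3 + a3 * b0 + a1 * b2 - a2 * b1,  # e12 part
    )

def reverse(a):
    """Reversion: flips the sign of the bivector component."""
    return (a[0], a[1], a[2], -a[3])

def rotor(theta):
    """Unit rotor for a counter-clockwise rotation by theta radians."""
    return (math.cos(theta / 2), 0.0, 0.0, -math.sin(theta / 2))

def rotate(a, theta):
    """Rotate a multivector with the sandwich product R a R~."""
    r = rotor(theta)
    return geometric_product(geometric_product(r, a), reverse(r))

def close(a, b, tol=1e-9):
    """Component-wise comparison up to floating-point tolerance."""
    return all(abs(x - y) <= tol for x, y in zip(a, b))

e1 = (0.0, 1.0, 0.0, 0.0)
e2 = (0.0, 0.0, 1.0, 0.0)

# Two arbitrary multivectors and an arbitrary rotation angle.
a = (0.5, 1.0, -2.0, 0.3)
b = (1.5, -0.7, 0.4, 2.0)
theta = 0.8

# Equivariance check: rotating the product equals the product of the rotated inputs.
lhs = rotate(geometric_product(a, b), theta)
rhs = geometric_product(rotate(a, theta), rotate(b, theta))
```

Here `lhs` and `rhs` agree up to floating-point error because the rotor has unit norm, so the inner `R~ R` factors cancel. How a network built from such operations relaxes exact equivariance into a soft bias is specific to the paper and is not shown in this sketch.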