🤖 AI Summary
This work addresses the unreliability of world models in counterfactual dynamics prediction under distribution shifts and interventions. To this end, the authors propose integrating CausalVAE as a plug-and-play module into diverse encoder-transition backbone architectures. This approach preserves strong factual prediction performance while, for the first time, enabling interpretable learning of latent causal structures. The method substantially enhances model robustness and counterfactual reasoning capabilities under interventions, achieving a 102.5% average improvement in CF-H@1 on Physics benchmarks. Notably, under a specific GNN-NLL configuration, the CF-H@1 score increases from 11.0 to 41.0, representing a 272.7% relative gain.
📝 Abstract
In this work, CausalVAE is introduced as a plug-in structural module for latent world models and is attached to diverse encoder-transition backbones. Across the reported benchmarks, competitive factual prediction is preserved and intervention-aware counterfactual retrieval is improved after the plug-in is added, suggesting stronger robustness under distribution shift and interventions. The largest gains are observed on the Physics benchmark: when averaged over 8 paired baselines, CF-H@1 is improved by +102.5%. In a representative GNN-NLL setting on Physics, CF-H@1 is increased from 11.0 to 41.0 (+272.7%). Through causal analysis, learned structural dependencies are shown to recover meaningful first-order physical interaction trends, supporting the interpretability of the learned latent causal structure.