Discrete World Models via Regularization

📅 2026-03-02

🤖 AI Summary

This work addresses unsupervised learning of informative and dynamically accurate discrete (Boolean) world models without decoder-based reconstruction or contrastive signals. The proposed method, DWMR, couples a latent prediction loss with regularizers (variance, correlation, and coskewness penalties) that maximize the entropy and independence of the representation bits, together with a locality prior that models sparse action effects. A new training scheme improves robustness to discrete roll-outs. On two benchmarks with underlying combinatorial structure, the method learns more accurate state representations and transitions than reconstruction-based baselines, and pairing it with an auxiliary reconstruction decoder yields further gains.
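
To make the three penalties concrete, here is a minimal NumPy sketch of what variance, correlation, and coskewness terms over a batch of bit activations could look like. The exact formulas used in the paper are not given on this page, so the function name `bit_regularizers`, the squared-moment form of each penalty, and the `eps` parameter are all assumptions made for illustration.

```python
import numpy as np

def bit_regularizers(p, eps=1e-8):
    """Illustrative penalties on a (batch, bits) array of activations in [0, 1].

    Hypothetical formulation, not the paper's exact loss.
    """
    p = np.asarray(p, dtype=float)
    n, d = p.shape
    centered = p - p.mean(axis=0)
    var = (centered ** 2).mean(axis=0)

    # Variance: a value in [0, 1] has variance at most 0.25, reached by a
    # fair bit, so this gap is non-negative and zero at maximum entropy.
    variance_penalty = float(np.mean(0.25 - var))

    # Standardize each bit (constant bits map to zeros thanks to eps).
    z = centered / (np.sqrt(var) + eps)

    # Correlation: mean squared off-diagonal entry of the correlation matrix.
    corr = z.T @ z / n
    off_diag = corr - np.diag(np.diag(corr))
    correlation_penalty = float(np.sum(off_diag ** 2) / max(d * (d - 1), 1))

    # Coskewness: mean squared third-order moment E[z_i z_j z_k] over all
    # index triples (this includes the per-bit skewness terms i = j = k).
    cosk = np.einsum("ni,nj,nk->ijk", z, z, z) / n
    coskewness_penalty = float(np.mean(cosk ** 2))

    return variance_penalty, correlation_penalty, coskewness_penalty

# Third bit is the XOR of the first two: pairwise uncorrelated, yet dependent.
xor_bits = np.array([[0, 0, 0], [0, 1, 1], [1, 0, 1], [1, 1, 0]])
var_pen, corr_pen, cosk_pen = bit_regularizers(xor_bits)
print(var_pen, corr_pen, cosk_pen)
```

In the XOR example every bit is balanced and every pair of bits is uncorrelated, so the variance and correlation penalties are essentially zero, while the coskewness penalty is about 0.22. This is one way to see why a third-order term can be useful in addition to pairwise decorrelation.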

📝 Abstract

World models aim to capture the states and dynamics of an environment in a compact latent space. Moreover, using Boolean state representations is particularly useful for search heuristics and symbolic reasoning and planning. Existing approaches keep latents informative via decoder-based reconstruction, or instead via contrastive or reward signals. In this work, we introduce Discrete World Models via Regularization (DWMR): a reconstruction-free and contrastive-free method for unsupervised Boolean world-model learning. In particular, we introduce a novel world-modeling loss that couples latent prediction with specialized regularizers. Such regularizers maximize the entropy and independence of the representation bits through variance, correlation, and coskewness penalties, while simultaneously enforcing a locality prior for sparse action changes. To enable effective optimization, we also introduce a novel training scheme improving robustness to discrete roll-outs. Experiments on two benchmarks with underlying combinatorial structure show that DWMR learns more accurate representations and transitions than reconstruction-based alternatives. Finally, DWMR can also be paired with an auxiliary reconstruction decoder, and this combination yields additional gains.
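
The locality prior for sparse action changes can be pictured as a penalty on how many bits differ between consecutive latent states. The sketch below is only one plausible reading: the hinge form, the function name `locality_penalty`, and the `budget` parameter (the number of bit flips tolerated for free) are assumptions, not details taken from the paper.

```python
import numpy as np

def locality_penalty(z_t, z_next, budget=1.0):
    """Hypothetical locality prior: penalize transitions that change more
    than `budget` bits, measured by a soft Hamming distance.

    z_t, z_next: (batch, bits) arrays of bit activations in [0, 1].
    """
    z_t = np.asarray(z_t, dtype=float)
    z_next = np.asarray(z_next, dtype=float)
    # Soft Hamming distance per transition (exact for hard 0/1 bits).
    flips = np.abs(z_next - z_t).sum(axis=1)
    # Hinge: only the flips beyond the budget are penalized.
    return float(np.mean(np.maximum(flips - budget, 0.0)))

state = np.array([[0, 0, 0, 0]])
local_step = np.array([[1, 0, 0, 0]])   # one bit changes
global_step = np.array([[1, 1, 1, 0]])  # three bits change
print(locality_penalty(state, local_step))   # 0.0
print(locality_penalty(state, global_step))  # 2.0
```

Under this reading, an action that flips a single bit costs nothing, while an action that rewrites most of the state is discouraged, which matches the intuition that actions in combinatorial environments usually affect only a few state variables.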

Problem

Research questions and friction points this paper is trying to address.

discrete world models

Boolean representations

unsupervised learning

symbolic reasoning

latent representation

Innovation

Methods, ideas, or system contributions that make the work stand out.

discrete world models

regularization

Boolean representations