Latent World Models for Automated Driving: A Unified Taxonomy, Evaluation Framework, and Open Challenges

📅 2026-03-09

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the lack of a unified latent world model framework in automated driving, a gap that hinders joint optimization of simulation, prediction, and decision-making and contributes to the mismatch between open-loop and closed-loop evaluation. The paper proposes a unifying latent-space framework that organizes the design space by the target and form of latent representations (latent worlds, latent actions, and latent generators; continuous states, discrete tokens, and hybrids) and by structural priors for geometry, topology, and semantics. Building on this taxonomy, it articulates five cross-cutting internal mechanics: structural isomorphism, long-horizon temporal stability, semantic and reasoning alignment, value-aligned objectives and post-training, and adaptive computation and deliberation. It connects these design choices to robustness, generalization, and deployability. It also proposes evaluation prescriptions, including a closed-loop metric suite and a resource-aware deliberation cost, designed to reduce the open-loop / closed-loop mismatch, and it identifies research directions toward decision-ready, verifiable, and resource-efficient automated driving.

📝 Abstract

Emerging generative world models and vision-language-action (VLA) systems are rapidly reshaping automated driving by enabling scalable simulation, long-horizon forecasting, and capability-rich decision making. Across these directions, latent representations serve as the central computational substrate: they compress high-dimensional multi-sensor observations, enable temporally coherent rollouts, and provide interfaces for planning, reasoning, and controllable generation. This paper proposes a unifying latent-space framework that synthesizes recent progress in world models for automated driving. The framework organizes the design space by the target and form of latent representations (latent worlds, latent actions, latent generators; continuous states, discrete tokens, and hybrids) and by structural priors for geometry, topology, and semantics. Building on this taxonomy, the paper articulates five cross-cutting internal mechanics (i.e., structural isomorphism, long-horizon temporal stability, semantic and reasoning alignment, value-aligned objectives and post-training, and adaptive computation and deliberation) and connects these design choices to robustness, generalization, and deployability. The work also proposes concrete evaluation prescriptions, including a closed-loop metric suite and a resource-aware deliberation cost, designed to reduce the open-loop / closed-loop mismatch. Finally, the paper identifies actionable research directions toward advancing latent world models for decision-ready, verifiable, and resource-efficient automated driving.

Problem

Research questions and friction points this paper is trying to address.

latent world models

automated driving

evaluation framework

taxonomy

open challenges

Innovation

Methods, ideas, or system contributions that make the work stand out.

Latent World Models

Unified Taxonomy

Closed-loop Evaluation