Orbis: Overcoming Challenges of Long-Horizon Prediction in Driving World Models

📅 2025-07-17

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Current world models for autonomous driving struggle with long-horizon generation and with generalization to challenging scenarios such as turning maneuvers and urban traffic. To address this, we develop a model built on simple design choices that needs no additional supervision or sensors (maps, depth, or multiple cameras). To test whether discrete token models hold advantages over continuous models based on flow matching, we set up a hybrid tokenizer that is compatible with both approaches and allows a side-by-side comparison. With only 469M parameters and 280 hours of training video, the model reaches state-of-the-art performance and stands out in difficult scenarios. The comparison favors the continuous autoregressive model, which is less sensitive to individual design choices and more powerful than the model built on discrete tokens.

📝 Abstract

Existing world models for autonomous driving struggle with long-horizon generation and generalization to challenging scenarios. In this work, we develop a model using simple design choices, and without additional supervision or sensors, such as maps, depth, or multiple cameras. We show that our model yields state-of-the-art performance, despite having only 469M parameters and being trained on 280h of video data. It particularly stands out in difficult scenarios like turning maneuvers and urban traffic. We test whether discrete token models possibly have advantages over continuous models based on flow matching. To this end, we set up a hybrid tokenizer that is compatible with both approaches and allows for a side-by-side comparison. Our study concludes in favor of the continuous autoregressive model, which is less brittle on individual design choices and more powerful than the model built on discrete tokens. Code, models and qualitative results are publicly available at https://lmb-freiburg.github.io/orbis.github.io/.

Problem

Research questions and friction points this paper is trying to address.

Overcoming the challenges of long-horizon prediction in driving world models

Improving generalization to challenging driving scenarios

Comparing discrete-token and continuous (flow-matching) world models for driving

Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid tokenizer for discrete and continuous model comparison

Continuous autoregressive model outperforms discrete token model

Simple design with only 469M parameters and 280h of training video, without additional supervision or sensors
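
For readers unfamiliar with the flow matching objective that the continuous model is based on, the sketch below shows one common variant: a straight-line path between a noise sample and a target latent, with the network trained to regress the constant velocity along that path. This is a generic illustration, not the paper's implementation. The function names, the latent shape, and the choice of a linear path are all assumptions, since this summary does not state the exact formulation used in Orbis.

```python
import numpy as np

def flow_matching_pair(x1, rng):
    """Build one training pair for a linear-path flow matching objective.

    x1: target latents, shape (batch, dim). Hypothetical stand-in for
    next-frame latents; the real model's latent layout is not given here.
    """
    x0 = rng.standard_normal(x1.shape)        # noise sample, same shape as x1
    t = rng.uniform(size=(x1.shape[0], 1))    # one time in [0, 1) per item
    xt = (1.0 - t) * x0 + t * x1              # point on the straight path
    v_target = x1 - x0                        # constant velocity of that path
    return xt, t, v_target

def flow_matching_loss(v_pred, v_target):
    """Mean squared error between predicted and target velocity."""
    return float(np.mean((v_pred - v_target) ** 2))

rng = np.random.default_rng(0)
latents = rng.standard_normal((4, 16))        # assumed toy latent batch
xt, t, v_target = flow_matching_pair(latents, rng)

# A real model would predict v_pred from (xt, t, context frames).
# Two fixed predictions are compared here only to exercise the loss.
perfect = flow_matching_loss(v_target, v_target)
zero_pred = flow_matching_loss(np.zeros_like(v_target), v_target)
```

A discrete-token model would instead quantize the latents to codebook indices and train with a cross-entropy loss over those indices. A hybrid tokenizer that serves both kinds of model is what makes the side-by-side comparison described above possible.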
