Addressing Corner Cases in Autonomous Driving: A World Model-based Approach with Mixture of Experts and LLMs

📅 2025-10-23
🤖 AI Summary
Autonomous driving systems predict poorly in rare yet safety-critical corner cases, primarily because of training data bias and limited model generalization. To address this, we propose WM-MoE, a framework that integrates a world model, a Mixture-of-Experts (MoE), and a large language model (LLM). A *scenario routing mechanism* decomposes complex corner cases and assigns them to specialized experts, and a lightweight temporal tokenizer maps agent trajectories and contextual cues into the LLM's feature space without additional training, supporting long-horizon and counterfactual reasoning. Notably, WM-MoE is presented as the first world model-based motion forecasting framework to unify perception, temporal memory, and decision making. To standardize evaluation, we release *nuScenes-corner*, a new benchmark dedicated to corner-case prediction. Experiments report state-of-the-art performance on four datasets (nuScenes, NGSIM, HighD, and MoCAD), with improved robustness under both corner-case and missing-data conditions.

📝 Abstract
Accurate and reliable motion forecasting is essential for the safe deployment of autonomous vehicles (AVs), particularly in rare but safety-critical scenarios known as corner cases. Existing models often underperform in these situations due to an over-representation of common scenes in training data and limited generalization capabilities. To address this limitation, we present WM-MoE, the first world model-based motion forecasting framework that unifies perception, temporal memory, and decision making to handle high-risk corner-case scenarios. The model constructs a compact scene representation that explains current observations, anticipates future dynamics, and evaluates the outcomes of potential actions. To enhance long-horizon reasoning, we leverage large language models (LLMs) and introduce a lightweight temporal tokenizer that maps agent trajectories and contextual cues into the LLM's feature space without additional training, enriching temporal context and commonsense priors. Furthermore, a mixture-of-experts (MoE) decomposes complex corner cases into subproblems and allocates capacity across scenario types: a router assigns scenes to specialized experts that infer agent intent and perform counterfactual rollouts. In addition, we introduce nuScenes-corner, a new benchmark that comprises four real-world corner-case scenarios for rigorous evaluation. Extensive experiments on four benchmark datasets (nuScenes, NGSIM, HighD, and MoCAD) show that WM-MoE consistently outperforms state-of-the-art (SOTA) baselines and remains robust under corner-case and missing-data conditions, indicating the promise of world model-based architectures for robust and generalizable motion forecasting in fully autonomous vehicles.
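
The routing idea in the abstract (a router assigns each scene to specialized experts, whose rollouts are then combined) can be illustrated with a minimal sketch. This is not the WM-MoE implementation: the two toy experts, the hand-supplied `gate_logits`, and all function names below are assumptions made for illustration. In a real model the gate scores would come from a learned network over the scene representation.

```python
import math
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]
# Hypothetical expert signature: (observed history, horizon) -> future points.
Expert = Callable[[Sequence[Point], int], List[Point]]


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over the gate scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def constant_velocity_expert(history: Sequence[Point], horizon: int) -> List[Point]:
    """Toy expert: extrapolate the last observed step unchanged."""
    (x0, y0), (x1, y1) = history[-2], history[-1]
    vx, vy = x1 - x0, y1 - y0
    return [(x1 + vx * t, y1 + vy * t) for t in range(1, horizon + 1)]


def braking_expert(history: Sequence[Point], horizon: int, decay: float = 0.5) -> List[Point]:
    """Toy expert: the last observed step shrinks by `decay` at every future step."""
    (x0, y0), (x1, y1) = history[-2], history[-1]
    vx, vy = x1 - x0, y1 - y0
    x, y = x1, y1
    out = []
    for _ in range(horizon):
        vx, vy = vx * decay, vy * decay
        x, y = x + vx, y + vy
        out.append((x, y))
    return out


def route_and_predict(
    history: Sequence[Point],
    gate_logits: Sequence[float],
    experts: Sequence[Expert],
    horizon: int,
    top_k: int = 1,
) -> Tuple[List[Point], List[int]]:
    """Pick the top-k experts by gate probability and blend their rollouts."""
    probs = softmax(gate_logits)
    top = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in top)
    weights = [probs[i] / norm for i in top]  # renormalize over selected experts
    rollouts = [experts[i](history, horizon) for i in top]
    fused = [
        (
            sum(w * r[t][0] for w, r in zip(weights, rollouts)),
            sum(w * r[t][1] for w, r in zip(weights, rollouts)),
        )
        for t in range(horizon)
    ]
    return fused, top
```

With `top_k=1` the router behaves as a hard switch between scenario types; a larger `top_k` blends several experts, which trades specialization for smoother predictions when the gate is uncertain.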
Problem

Research questions and friction points this paper is trying to address.

Improves motion forecasting for autonomous vehicles in rare safety-critical scenarios
Addresses limited generalization of existing models in corner cases
Enhances long-horizon reasoning using LLMs and mixture-of-experts architecture
Innovation

Methods, ideas, or system contributions that make the work stand out.

World model framework unifies perception, memory, decision making
LLMs enhance reasoning via lightweight temporal tokenizer mapping
Mixture-of-experts decomposes corner cases into specialized subproblems
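
To make the temporal tokenizer point above more concrete, here is a minimal sketch of a training-free mapping from a trajectory to fixed-size token vectors. The paper's actual tokenizer is not described on this page, so everything here is an assumption: the patching scheme, the sinusoidal feature map, and the names `displacements`, `sinusoidal_features`, and `tokenize` are hypothetical stand-ins chosen only because they have no learned parameters.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def displacements(traj: Sequence[Point]) -> List[Point]:
    """Per-step motion (dx, dy) between consecutive trajectory points."""
    return [(x1 - x0, y1 - y0) for (x0, y0), (x1, y1) in zip(traj, traj[1:])]


def sinusoidal_features(value: float, dim: int) -> List[float]:
    """Fixed (non-learned) sin/cos features of a scalar; `dim` must be even."""
    feats = []
    for k in range(dim // 2):
        freq = 1.0 / (10000 ** (2 * k / dim))
        feats.append(math.sin(value * freq))
        feats.append(math.cos(value * freq))
    return feats


def tokenize(traj: Sequence[Point], patch_len: int = 2, dim: int = 8) -> List[List[float]]:
    """Map a trajectory to a sequence of `dim`-sized tokens without any training.

    Assumed scheme: split the per-step displacements into non-overlapping
    patches, average each patch, and encode the mean dx and mean dy with
    fixed sinusoidal features. Temporal order is carried by token position.
    """
    if dim % 4 != 0:
        raise ValueError("dim must be a multiple of 4 (half for dx, half for dy)")
    steps = displacements(traj)
    tokens = []
    for start in range(0, len(steps) - patch_len + 1, patch_len):
        patch = steps[start:start + patch_len]
        mean_dx = sum(dx for dx, _ in patch) / patch_len
        mean_dy = sum(dy for _, dy in patch) / patch_len
        tokens.append(
            sinusoidal_features(mean_dx, dim // 2) + sinusoidal_features(mean_dy, dim // 2)
        )
    return tokens
```

In practice `dim` would be set to the hidden size of the target LLM so the tokens can be fed alongside its own embeddings; whether a fixed encoding like this is sufficient is exactly the kind of question the paper's experiments are meant to answer.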
Authors

Haicheng Liao
State Key Laboratory of Internet of Things for Smart City, University of Macau, Macau SAR, China; Department of Computer and Information Science, University of Macau, Macau SAR, China
Bonan Wang
Unknown affiliation
Trajectory Prediction
Junxian Yang
State Key Laboratory of Internet of Things for Smart City, University of Macau, Macau SAR, China
Chengyue Wang
State Key Laboratory of Internet of Things for Smart City, University of Macau, Macau SAR, China; Department of Civil and Environmental Engineering, University of Macau, Macau SAR, China
Zhengbin He
Senseable City Lab, Massachusetts Institute of Technology, Cambridge MA, United States
Guohui Zhang
Professor of Civil Engineering, University of Hawaii
Traffic Engineering, ITS, Traffic Detection, Traffic System Modeling, Simulation
Chengzhong Xu
State Key Laboratory of Internet of Things for Smart City, University of Macau, Macau SAR, China; Department of Computer and Information Science, University of Macau, Macau SAR, China
Zhenning Li
State Key Laboratory of Internet of Things for Smart City, University of Macau, Macau SAR, China; Department of Computer and Information Science, University of Macau, Macau SAR, China; Department of Civil and Environmental Engineering, University of Macau, Macau SAR, China