GEMINUS: Dual-aware Global and Scene-Adaptive Mixture-of-Experts for End-to-End Autonomous Driving

📅 2025-07-18
🤖 AI Summary
To balance robustness and scene adaptability for end-to-end autonomous driving in complex, dynamic traffic environments, this paper proposes GEMINUS, a Mixture-of-Experts (MoE) framework. The method separates a global driving policy from scene-specific policies: a Global Expert is trained on the full dataset, and a Scene-Adaptive Experts Group is trained on the corresponding scene subsets. A Dual-aware Router considers both scenario-level features and routing uncertainty to dynamically activate expert modules. The model runs end-to-end from monocular visual input. On the Bench2Drive closed-loop benchmark it achieves state-of-the-art Driving Score and Success Rate. In ablation studies it improves over the single-expert baseline by 7.67% in Driving Score, 22.06% in Success Rate, and 19.41% in MultiAbility-Mean.

📝 Abstract
End-to-end autonomous driving requires adaptive and robust handling of complex and diverse traffic environments. However, prevalent single-mode planning methods attempt to learn an overall policy while struggling to acquire diversified driving skills to handle diverse scenarios. Therefore, this paper proposes GEMINUS, a Mixture-of-Experts end-to-end autonomous driving framework featuring a Global Expert, a Scene-Adaptive Experts Group, and a Dual-aware Router. Specifically, the Global Expert is trained on the overall dataset, possessing robust performance. The Scene-Adaptive Experts are trained on corresponding scene subsets, achieving adaptive performance. The Dual-aware Router simultaneously considers scenario-level features and routing uncertainty to dynamically activate expert modules. Through the effective coupling of the Global Expert and the Scene-Adaptive Experts Group via the Dual-aware Router, GEMINUS achieves adaptive and robust performance in diverse scenarios. GEMINUS outperforms existing methods in the Bench2Drive closed-loop benchmark and achieves state-of-the-art performance in Driving Score and Success Rate, even with only monocular vision input. Furthermore, ablation studies demonstrate significant improvements over the original single-expert baseline: 7.67% in Driving Score, 22.06% in Success Rate, and 19.41% in MultiAbility-Mean. The code will be available at https://github.com/newbrains1/GEMINUS.
Problem

Research questions and friction points this paper is trying to address.

Adaptive handling of diverse traffic environments in autonomous driving
Overcoming limitations of single-mode planning with diverse driving skills
Dynamic expert module activation for robust performance in varied scenarios
Innovation

Methods, ideas, or system contributions that make the work stand out.

Mixture-of-Experts framework with Global Expert
Scene-Adaptive Experts Group for diverse scenarios
Dual-aware Router for dynamic expert activation
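
The three components above can be illustrated with a small routing sketch. This is not the GEMINUS implementation: the summary does not say how the router combines scene features with routing uncertainty, so the sketch assumes one plausible rule. It takes the softmax over per-scene logits as the scene signal, uses normalized entropy as the uncertainty signal, and falls back to the global expert when uncertainty exceeds a threshold. The names `route`, `normalized_entropy`, `threshold`, and the toy experts are all hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple

# An "expert" here is any function mapping scene features to a plan
# (for illustration, a list of floats). Hypothetical type, not from the paper.
Expert = Callable[[Sequence[float]], List[float]]


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over router logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def normalized_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy scaled to [0, 1]; 1 means the router is maximally unsure."""
    if len(probs) < 2:
        return 0.0
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(len(probs))


def route(
    scene_logits: Sequence[float],
    features: Sequence[float],
    global_expert: Expert,
    scene_experts: Sequence[Expert],
    threshold: float = 0.5,  # assumed cutoff, not a value from the paper
) -> Tuple[str, List[float]]:
    """Pick one expert: the global one if routing is uncertain, else the top scene expert."""
    probs = softmax(scene_logits)
    if normalized_entropy(probs) > threshold:
        return "global", global_expert(features)
    best = max(range(len(probs)), key=probs.__getitem__)
    return f"scene[{best}]", scene_experts[best](features)


# Toy experts that ignore their input and return a constant "plan".
toy_global: Expert = lambda f: [0.0]
toy_scene: List[Expert] = [lambda f: [1.0], lambda f: [2.0], lambda f: [3.0]]

confident = route([5.0, 0.0, 0.0], [0.1], toy_global, toy_scene)
uncertain = route([0.0, 0.0, 0.0], [0.1], toy_global, toy_scene)
```

With peaked logits the router selects the matching scene expert, and with flat logits it defers to the global expert. Under this assumed rule, that is how robustness (global fallback) and adaptability (scene specialization) coexist. A hard threshold is only one option; a soft weighting of global and scene outputs would be another.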