AI Summary
End-to-end autonomous driving systems face three key challenges in complex scenarios: ambiguous or noisy semantics that undermine decision reliability, suboptimal planning caused by interference between driving tasks, and safety risks arising from high inference latency. To address these, this paper proposes ExpertAD, a novel framework comprising three core components. First, a Perception Adapter (PA) amplifies task-critical semantic features while suppressing noise. Second, a Mixture of Sparse Experts (MoSE) architecture decouples perception and planning subtasks, mitigating cross-task interference. Third, a context-aware end-to-end Mixture-of-Experts (MoE) structure enables efficient, multi-skill collaborative decision-making. Extensive experiments on the CARLA benchmark demonstrate that ExpertAD reduces the average collision rate by up to 20% and inference latency by 25%, while generalizing well to rare scenarios and unseen urban environments.
Abstract
Recent advancements in end-to-end autonomous driving systems (ADSs) underscore their potential in perception and planning. However, challenges remain. Complex driving scenarios contain rich semantic information, yet ambiguous or noisy semantics can compromise decision reliability, while interference between multiple driving tasks may hinder optimal planning. Furthermore, prolonged inference latency slows decision-making, increasing the risk of unsafe driving behaviors. To address these challenges, we propose ExpertAD, a novel framework that enhances ADS performance with a Mixture of Experts (MoE) architecture. We introduce a Perception Adapter (PA) to amplify task-critical features, ensuring contextually relevant scene understanding, and a Mixture of Sparse Experts (MoSE) to minimize task interference during prediction, allowing for effective and efficient planning. Our experiments show that ExpertAD reduces average collision rates by up to 20% and inference latency by 25% compared to prior methods. We further evaluate its multi-skill planning capabilities in rare scenarios (e.g., accidents, yielding to emergency vehicles) and demonstrate strong generalization to unseen urban environments. Additionally, we present a case study that illustrates its decision-making process in complex driving scenarios.
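The abstract does not specify how MoSE selects its experts. As a rough illustration of sparse expert routing in general, the sketch below assumes top-k softmax gating, a common MoE scheme that is not confirmed to be the one ExpertAD uses. All names (`top_k_route`, `sparse_moe`, `gate_weights`) are hypothetical and chosen only for this example.

```python
import math
from typing import Callable, Dict, List, Sequence

Vector = List[float]
Expert = Callable[[Sequence[float]], Vector]


def softmax(scores: Sequence[float]) -> Vector:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_route(gate_scores: Sequence[float], k: int) -> Dict[int, float]:
    """Map each of the k highest-scoring experts to a mixing weight.

    Weights are a softmax over the selected scores only, so they sum to 1.
    """
    if not 1 <= k <= len(gate_scores):
        raise ValueError("k must be between 1 and the number of experts")
    ranked = sorted(range(len(gate_scores)),
                    key=lambda i: gate_scores[i], reverse=True)
    chosen = ranked[:k]
    weights = softmax([gate_scores[i] for i in chosen])
    return dict(zip(chosen, weights))


def sparse_moe(x: Sequence[float],
               gate_weights: Sequence[Sequence[float]],
               experts: Sequence[Expert],
               k: int = 2) -> Vector:
    """Weighted sum of the top-k experts' outputs.

    Assumption: the gate is a plain linear layer (one weight row per expert).
    Experts that are not selected are never called, which is where the
    compute saving of a sparse mixture comes from.
    """
    gate_scores = [sum(w * xi for w, xi in zip(row, x)) for row in gate_weights]
    routing = top_k_route(gate_scores, k)
    output: Vector = []
    for index, weight in routing.items():
        y = experts[index](x)
        if not output:
            output = [0.0] * len(y)
        output = [o + weight * yi for o, yi in zip(output, y)]
    return output
```

Calling `sparse_moe(x, gate_weights, experts, k=2)` with three experts evaluates only the two with the highest gate scores. Running fewer experts per input is the general mechanism by which sparse mixtures can lower inference cost, although the latency figures reported above come from the authors' own system and not from this sketch.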