🤖 AI Summary
Existing modular end-to-end (ME2E) autonomous driving systems often neglect system-level metrics such as inference latency and energy consumption during optimization, and their software and hardware are designed separately, which hinders efficient deployment. To bridge this gap, we propose the first reusable hardware-software co-optimization and closed-loop evaluation framework tailored for ME2E autonomous driving. Our approach jointly optimizes model architecture, operator scheduling, and hardware acceleration under a unified objective, and introduces a multidimensional evaluation protocol covering safety, comfort, efficiency, latency, and energy consumption. Experiments show that the method significantly reduces latency and energy usage across multiple ME2E systems without compromising driving performance, improving overall system-level efficiency.
📝 Abstract
Modular end-to-end (ME2E) autonomous driving paradigms combine modular interpretability with global optimization capability and have demonstrated strong performance. However, existing studies mainly focus on accuracy improvement, while critical system-level factors such as inference latency and energy consumption are often overlooked, resulting in increasingly complex model designs that hinder practical deployment. Prior efforts on model compression and acceleration typically optimize either the software or hardware side in isolation. Software-only optimization cannot fundamentally remove intermediate tensor access and operator scheduling overheads, whereas hardware-only optimization is constrained by model structure and precision. As a result, the real-world benefits of such optimizations are often limited. To address these challenges, this paper proposes a reusable software and hardware co-optimization and closed-loop evaluation framework for ME2E autonomous driving inference. The framework jointly integrates software-level model optimization with hardware-level computation optimization under a unified system-level objective. In addition, a multidimensional evaluation metric is introduced to assess system performance by jointly considering safety, comfort, efficiency, latency, and energy, enabling quantitative comparison of different optimization strategies. Experiments across multiple ME2E autonomous driving stacks show that the proposed framework preserves baseline-level driving performance while significantly reducing inference latency and energy consumption, achieving substantial overall system-level improvements. These results demonstrate that the proposed framework provides practical and actionable guidance for efficient deployment of ME2E autonomous driving systems.