🤖 AI Summary
Embedded autonomous driving platforms face conflicting demands of limited computational resources and heterogeneous accuracy requirements across diverse operational scenarios.
Method: This paper proposes a task-specific, learning-based adaptive semantic segmentation deployment framework. It introduces a three-level tunable mechanism—adjusting width multipliers, classifier depth, and convolutional kernel sizes—integrated with Bayesian optimization and a hardware-aware surrogate model to perform fine-grained, scenario-customized joint hyperparameter search under MACs constraints.
Contribution/Results: The framework uniquely unifies dynamically scalable network architectures with hardware-aware automated tuning, enabling co-optimization of model size and accuracy. Experiments on embedded platforms—including NVIDIA DRIVE PX 2—demonstrate substantial improvements in both resource utilization efficiency and segmentation accuracy, facilitating efficient, multi-task customized deployment.
📝 Abstract
Autonomous driving platforms encounter diverse driving scenarios, each with varying hardware resources and precision requirements. Given the computational limitations of embedded devices, it is crucial to consider computing costs when deploying on target platforms like the NVIDIA® DRIVE PX 2. Our objective is to customize the semantic segmentation network according to the computing power and specific scenarios of autonomous driving hardware. We implement dynamic adaptability through a three-tier control mechanism -- width multiplier, classifier depth, and classifier kernel -- allowing fine-grained control over model components based on hardware constraints and task requirements. This adaptability facilitates broad model scaling, targeted refinement of the final layers, and scenario-specific optimization of kernel sizes, leading to improved resource allocation and performance.
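To make the three-tier mechanism concrete, the sketch below models a tunable configuration and a rough MACs estimate for the classifier head. The class and function names, channel counts, and the simplified cost model are illustrative assumptions, not the paper's actual implementation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TSLAConfig:
    """Hypothetical three-level tunable configuration (names are illustrative)."""
    width_mult: float      # scales channel counts across the backbone
    classifier_depth: int  # number of conv layers in the segmentation head
    kernel_size: int       # kernel size of the classifier convolutions

def head_macs(cfg: TSLAConfig, in_ch: int = 256, n_classes: int = 19,
              h: int = 128, w: int = 256) -> int:
    """Rough MACs estimate for the classifier head under a given config.

    Channels scale with the width multiplier, and each extra head layer adds
    a k*k convolution -- mirroring how the three knobs trade accuracy against
    compute. The exact cost model here is an assumption for illustration.
    """
    ch = max(8, int(in_ch * cfg.width_mult))
    macs = 0
    for _ in range(cfg.classifier_depth):
        macs += ch * ch * cfg.kernel_size ** 2 * h * w  # k x k conv layer
    macs += ch * n_classes * h * w                      # final 1x1 classifier
    return macs

# A wider, deeper head with larger kernels costs more MACs:
small = TSLAConfig(width_mult=0.5, classifier_depth=1, kernel_size=1)
large = TSLAConfig(width_mult=1.0, classifier_depth=2, kernel_size=3)
assert head_macs(small) < head_macs(large)
```

Under this kind of cost model, a deployment framework can enumerate or search configurations and keep only those whose estimated MACs fit the target platform's budget.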
Additionally, we leverage Bayesian Optimization with surrogate modeling to efficiently explore hyperparameter spaces under tight computational budgets. Our approach addresses scenario-specific and task-specific requirements through automatic parameter search, accommodating the unique computational complexity and accuracy needs of autonomous driving. The framework scales its Multiply-Accumulate Operations (MACs) for Task-Specific Learning Adaptation (TSLA), yielding alternative configurations tailored to diverse self-driving tasks. These TSLA customizations maximize computational capacity and model accuracy, optimizing hardware utilization.
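The search loop described above can be sketched as follows. This is a minimal stand-in, not the paper's method: the surrogate here is a simple nearest-neighbor predictor with an exploration bonus rather than a Gaussian process, but the overall structure -- fit a surrogate on evaluated points, pick the most promising MACs-feasible candidate, evaluate it -- matches surrogate-assisted search under a hardware budget. All names and the toy objective are assumptions.

```python
import random

def surrogate_search(objective, macs_of, macs_budget, space,
                     n_init=5, n_iter=20, seed=0):
    """Surrogate-assisted hyperparameter search under a MACs constraint.

    `space` is a list of discrete value tuples, one per hyperparameter.
    Only candidates with macs_of(x) <= macs_budget are ever evaluated.
    """
    rng = random.Random(seed)

    def sample():
        return tuple(rng.choice(vals) for vals in space)

    def feasible(x):
        return macs_of(x) <= macs_budget

    # Seed the surrogate with a few feasible random evaluations.
    evaluated = {}
    while len(evaluated) < n_init:
        x = sample()
        if feasible(x) and x not in evaluated:
            evaluated[x] = objective(x)

    def predict(x):
        # Nearest-neighbor surrogate plus a distance-based exploration bonus
        # (a cheap substitute for a GP acquisition function).
        dists = [(sum((a - b) ** 2 for a, b in zip(x, y)), v)
                 for y, v in evaluated.items()]
        d, v = min(dists)
        return v + 0.1 * d ** 0.5

    for _ in range(n_iter):
        cands = [sample() for _ in range(64)]
        cands = [c for c in cands if feasible(c) and c not in evaluated]
        if cands:
            best = max(cands, key=predict)   # most promising by surrogate
            evaluated[best] = objective(best)

    return max(evaluated.items(), key=lambda kv: kv[1])

# Toy usage: width / depth / kernel knobs, accuracy and MACs both grow
# with each knob; the search must stay under a 2 GMACs budget.
space = [(0.25, 0.5, 0.75, 1.0), (1, 2, 3), (1, 3, 5)]
macs_of = lambda x: x[0] * x[1] * x[2] * 1e9
acc = lambda x: 0.6 + 0.2 * x[0] + 0.05 * x[1] + 0.01 * x[2]
best_cfg, best_acc = surrogate_search(acc, macs_of, macs_budget=2e9, space=space)
assert macs_of(best_cfg) <= 2e9
```

In a real TSLA-style deployment, `objective` would be validation mIoU of a trained or fine-tuned network and `macs_of` a hardware-aware cost model, so each evaluation is expensive and the surrogate's sample efficiency matters.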