🤖 AI Summary
Existing autonomous driving systems struggle to adapt dynamically to the diverse, context-dependent driving-style preferences of human drivers, as they are often constrained by predefined style templates or require frequent user feedback. This paper proposes an end-to-end adaptive framework based on multi-objective reinforcement learning (MORL) that, for the first time, embeds an interpretable, continuous preference weight vector directly into the policy network to decouple and regulate stylistic dimensions including efficiency, comfort, speed, and aggressiveness. The method requires no online retraining and allows preferences to be adjusted at runtime. Evaluated in the CARLA urban simulation environment, the framework integrates visual perception with preference-driven behavioral modulation, significantly improving style-switching flexibility. It achieves a 98.2% route completion rate and a collision rate below 0.5%, demonstrating strong generalization across preference configurations while balancing safety and performance.
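The summary describes a continuous preference weight vector that regulates several style objectives within a single policy. As a minimal sketch of how such a preference-conditioned objective is commonly written (the per-objective reward terms and the linear scalarization below are illustrative assumptions, not the paper's exact formulation):

$$
\mathbf{r}(s,a)=\bigl(r_{\text{eff}},\,r_{\text{comf}},\,r_{\text{speed}},\,r_{\text{aggr}}\bigr),\qquad
r_{\mathbf{w}}(s,a)=\mathbf{w}^{\top}\mathbf{r}(s,a),\qquad \mathbf{w}\ge 0,\ \lVert\mathbf{w}\rVert_{1}=1,
$$

so that a single policy $\pi(a\mid s,\mathbf{w})$ conditioned on $\mathbf{w}$ can shift its driving style at runtime simply by changing the weight vector, with no retraining.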
📝 Abstract
Human drivers exhibit individual preferences regarding driving style. Adapting autonomous vehicles to these preferences is essential for user trust and satisfaction. However, existing end-to-end driving approaches often rely on predefined driving styles or require continuous user feedback for adaptation, limiting their ability to support dynamic, context-dependent preferences. We propose a novel approach using multi-objective reinforcement learning (MORL) with preference-driven optimization for end-to-end autonomous driving that enables runtime adaptation to driving style preferences. Preferences are encoded as continuous weight vectors to modulate behavior along interpretable style objectives (efficiency, comfort, speed, and aggressiveness) without requiring policy retraining. Our single-policy agent integrates vision-based perception in complex mixed-traffic scenarios and is evaluated in diverse urban environments using the CARLA simulator. Experimental results demonstrate that the agent dynamically adapts its driving behavior according to changing preferences while maintaining performance in terms of collision avoidance and route completion.
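As a rough sketch of what a vision-based, preference-conditioned single policy could look like, consider the following PyTorch snippet. The `PreferencePolicy` name, the network shape, and the concatenation of image features with the weight vector are assumptions made for illustration; the paper's actual architecture and training setup are not specified in this summary.

```python
import torch
import torch.nn as nn

class PreferencePolicy(nn.Module):
    """Single policy conditioned on a continuous preference weight vector.

    Hypothetical sketch: a small CNN encodes the camera image, the 4-dim
    preference vector (efficiency, comfort, speed, aggressiveness) is
    concatenated with the visual features, and an MLP outputs controls.
    """

    def __init__(self, num_preferences: int = 4, action_dim: int = 2):
        super().__init__()
        self.encoder = nn.Sequential(               # vision backbone (illustrative)
            nn.Conv2d(3, 32, 5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Sequential(                  # preference-conditioned head
            nn.Linear(64 + num_preferences, 128), nn.ReLU(),
            nn.Linear(128, action_dim), nn.Tanh(),  # e.g. steering, throttle/brake
        )

    def forward(self, image: torch.Tensor, weights: torch.Tensor) -> torch.Tensor:
        features = self.encoder(image)
        return self.head(torch.cat([features, weights], dim=-1))

# Runtime style switching: the same trained policy, different weight vectors.
policy = PreferencePolicy()
frame = torch.zeros(1, 3, 128, 128)                   # dummy camera frame
comfort_first = torch.tensor([[0.2, 0.6, 0.1, 0.1]])  # weights sum to 1
sporty = torch.tensor([[0.2, 0.1, 0.4, 0.3]])
print(policy(frame, comfort_first), policy(frame, sporty))
```

Because the preference vector is an input to the policy rather than part of the reward alone, switching from `comfort_first` to `sporty` changes the produced controls immediately, which is the runtime-adaptation property the abstract highlights.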