🤖 AI Summary
This paper addresses the challenge of jointly personalizing user comfort, interaction preferences, and safety requirements in intelligent vehicle cockpits. To this end, we propose Sage Deer, a super-aligned generalist driving agent. Methodologically, we introduce the first three-dimensional super-alignment framework integrating user preferences, physiological states (e.g., heart rate, eye movement), and action intentions; design a self-triggered implicit chain-of-thought mechanism that enables cross-modal (vision, time-series, language) reasoning and preference-decoupled modeling; and establish the first cockpit-oriented multi-dimensional alignment benchmark alongside a large-scale, multi-source dataset. Experimental results on our custom benchmark show improvements of +12.7% in physiological state recognition accuracy, +18.3% in the consistency of joint emotion–behavior decision-making, and +24.1% in personalized response matching. Together, these results advance individualized, interpretable, and safety-aligned driving agents.
📝 Abstract
The intelligent driving cockpit, an important part of intelligent driving, needs to match different users' comfort, interaction, and safety needs. This paper aims to build a Super-Aligned and GEneralist DRiving agent, SAGE DeeR. Sage Deer achieves three highlights: (1) Super alignment: it reacts differently according to each user's preferences and biases. (2) Generalist: it understands multi-view and multimodal inputs to reason about the user's physiological indicators, facial emotions, hand movements, body movements, driving scenarios, and behavioral decisions. (3) Self-eliciting: it elicits implicit chains of thought in the language space to further improve its generalist and super-aligned abilities. In addition, we collected multiple datasets and built a large-scale benchmark. This benchmark measures Sage Deer's perceptual decision-making ability and the accuracy of its super alignment.