Sage Deer: A Super-Aligned Driving Generalist Is Your Copilot

📅 2025-05-15
🤖 AI Summary
This paper addresses the challenge of jointly personalizing comfort, interaction preferences, and safety requirements for users of intelligent vehicle cockpits. To this end, it proposes Sage Deer, a super-aligned generalist driving agent. Methodologically, the paper introduces the first three-dimensional super-alignment framework integrating user preferences, physiological states (e.g., heart rate, eye movement), and action intentions; designs a self-triggered implicit chain-of-thought mechanism that enables cross-modal (vision, time-series, language) reasoning and preference-decoupled modeling; and establishes the first cockpit-oriented multi-dimensional alignment benchmark alongside a large-scale, multi-source dataset. On the authors' own benchmark, the reported improvements are +12.7% accuracy in physiological state recognition, +18.3% consistency in joint emotion–behavior decision-making, and +24.1% in personalized response matching, results the summary presents as an advance toward individualized, interpretable, and safety-aligned driving agents.

📝 Abstract
The intelligent driving cockpit, an important part of intelligent driving, needs to match different users' comfort, interaction, and safety needs. This paper aims to build a Super-Aligned and GEneralist DRiving agent, SAGE DeeR. Sage Deer achieves three highlights: (1) Super alignment: It reacts differently according to different people's preferences and biases. (2) Generalist: It can understand multi-view and multi-modal inputs to reason about the user's physiological indicators, facial emotions, hand movements, body movements, driving scenarios, and behavioral decisions. (3) Self-Eliciting: It can elicit implicit thought chains in the language space to further increase its generalist and super-aligned abilities. In addition, we collected multiple datasets and built a large-scale benchmark. This benchmark measures Sage Deer's perceptual decision-making ability and the accuracy of its super alignment.
Problem

Research questions and friction points this paper is trying to address.

Develop a super-aligned driving agent for personalized user needs
Create a generalist system that processes multi-view, multi-modal inputs in driving scenarios
Enhance agent abilities through self-eliciting implicit thought chains
Innovation

Methods, ideas, or system contributions that make the work stand out.

Super alignment that tailors the agent's reactions to individual users' preferences
Generalist handling of multi-view, multi-modal inputs
Self-elicited implicit thought chains that enhance generalist and super-aligned abilities
LU Hao
HKUST, HKUST-GZ
Jiaqi Tang
HKUST, HKUST-GZ
Jiyao Wang
Postdoc, McGill University
human factors in automation, state monitoring, physiological measurement
Yunfan LU
HKUST, HKUST-GZ
Xu Cao
HKUST, HKUST-GZ
Qingyong Hu
Ph.D. of Computer Science, University of Oxford
3D Vision, Photogrammetry, Point Cloud Processing, Autonomous Driving
Yin Wang
ZJU
Yuting Zhang
HKUST(GZ)
rPPG, Computer Vision
Tianxin Xie
HKUST, HKUST-GZ
Yunpeng Zhang
Phigent
Yong Chen
GEELY
Jiayu Gao
GEELY
Bin Huang
HKUST, HKUST-GZ
Dengbo He
HKUST, HKUST-GZ
Shuiguang Deng
ZJU
Hao Chen
HKUST, HKUST-GZ
Ying-Cong Chen
Hong Kong University of Science and Technology (Guangzhou)
Computer Vision and Pattern Recognition