Learning Social Navigation from Positive and Negative Demonstrations and Rule-Based Specifications

📅 2025-10-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the challenge of achieving both behavioral adaptability and safety compliance in social navigation for mobile robots operating in dynamic human environments, this paper proposes an end-to-end policy-learning framework that integrates positive/negative example learning with rule-guided optimization. Methodologically: (1) a density-based reward function is designed to learn crowd-interaction preferences from expert demonstrations; (2) hard safety constraints, such as minimum separation distance and collision-avoidance directionality, are explicitly encoded into both the reward shaping and the objective function; (3) an uncertainty-aware teacher-student distillation mechanism, coupled with a sampling-based lookahead controller, is employed to train a lightweight student policy. Evaluated on synthetic scenarios and elevator co-riding simulations, the approach improves navigation success rate by 18.7% and reduces average task completion time by 23.4%. Deployment feasibility is further validated on real robotic platforms. The core contribution lies in the synergistic modeling of data-driven flexibility and rule-enforced safety, realized through efficient distillation.
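The summary's point (2) describes folding hard safety rules into the reward shaping alongside the learned density-based term. A minimal sketch of that combination is below; all function and parameter names (`shaped_reward`, `density_reward_fn`, `d_min`, the weights) are illustrative assumptions, not from the paper, and the learned term is stubbed as an arbitrary callable.

```python
import numpy as np

def shaped_reward(robot_pos, human_positions, goal_pos,
                  density_reward_fn, d_min=0.5,
                  w_rule=10.0, w_goal=1.0):
    """Combine a learned density-based reward with rule-based terms.

    density_reward_fn stands in for a model fit on positive/negative
    demonstrations; here it is any callable mapping a state to a scalar.
    """
    # Data-driven term: crowd-interaction preference learned from demos.
    r_data = density_reward_fn(robot_pos, human_positions)

    # Hard safety rule: penalize violating the minimum separation distance.
    dists = [np.linalg.norm(robot_pos - h) for h in human_positions]
    violation = max(0.0, d_min - min(dists)) if dists else 0.0
    r_rule = -w_rule * violation

    # Rule-based goal-reaching term: negative distance to the goal.
    r_goal = -w_goal * np.linalg.norm(robot_pos - goal_pos)

    return r_data + r_rule + r_goal
```

The large `w_rule` weight reflects the summary's framing of safety as a hard constraint: a separation violation dominates the data-driven preference term rather than trading off against it.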

📝 Abstract
Mobile robot navigation in dynamic human environments requires policies that balance adaptability to diverse behaviors with compliance with safety constraints. We hypothesize that integrating data-driven rewards with rule-based objectives enables navigation policies to achieve a more effective balance of adaptability and safety. To this end, we develop a framework that learns a density-based reward from positive and negative demonstrations and augments it with rule-based objectives for obstacle avoidance and goal reaching. A sampling-based lookahead controller produces supervisory actions that are both safe and adaptive, which are subsequently distilled into a compact student policy suitable for real-time operation with uncertainty estimates. Experiments in synthetic and elevator co-boarding simulations show consistent gains in success rate and time efficiency over baselines, and real-world demonstrations with human participants confirm the practicality of deployment. A video illustrating this work can be found on our project page https://chanwookim971024.github.io/PioneeR/.
Problem

Research questions and friction points this paper is trying to address.

Learning robot navigation from demonstrations and rule-based safety specifications
Balancing adaptability and safety in dynamic human environments
Developing real-time policies with uncertainty estimates for deployment
Innovation

Methods, ideas, or system contributions that make the work stand out.

Learning rewards from positive and negative demonstrations
Augmenting rewards with rule-based safety objectives
Distilling safe actions into real-time student policy
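The second and third bullets describe a sampling-based lookahead controller that generates safe supervisory actions, which are then distilled into a student policy. A minimal sketch of such a lookahead teacher follows, assuming a generic simulator hook and reward function; `lookahead_teacher`, `rollout_fn`, and `reward_fn` are hypothetical names, not the paper's API.

```python
import numpy as np

def lookahead_teacher(state, candidate_actions, rollout_fn, reward_fn,
                      horizon=5):
    """Score each candidate action by simulating a short rollout under it
    and summing rewards; return the best-scoring action.

    rollout_fn(state, action) -> next_state is a stand-in simulator step;
    reward_fn(state) -> float would be the shaped (data + rule) reward.
    """
    best_action, best_return = None, -np.inf
    for a in candidate_actions:
        s, ret = state, 0.0
        for _ in range(horizon):
            s = rollout_fn(s, a)   # hold the candidate action over the horizon
            ret += reward_fn(s)
        if ret > best_return:
            best_action, best_return = a, ret
    return best_action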
🔎 Similar Papers
No similar papers found.
Chanwoo Kim
Department of Artificial Intelligence, Korea University, Seoul, Republic of Korea
Jihwan Yoon
Department of Artificial Intelligence, Korea University, Seoul, Republic of Korea
Hyeonseong Kim
Ph.D. student at KAIST
Computer Vision, 3D Vision, Scene Parsing
Taemoon Jeong
Department of Artificial Intelligence, Korea University, Seoul, Republic of Korea
Changwoo Yoo
Department of Computer Science and Engineering, Seoul, Republic of Korea
Seungbeen Lee
Department of Artificial Intelligence, Yonsei University, Seoul, Republic of Korea
Seungbeen Lee
Robotics Institute, School of Computer Science at Carnegie Mellon University, Pittsburgh, PA, USA
Soohwan Byeon
Mobinn, Suwon, Republic of Korea
Hoon Chung
Mobinn, Suwon, Republic of Korea
Matthew Pan
Queen's University
Human-Robot Interaction
Jean Oh
Robotics Institute, Carnegie Mellon University
Robotics, Multimodal Perception, Social Navigation, Language-Vision Intersection, Artificial Intelligence
Kyungjae Lee
Department of Statistics, Korea University, Seoul, Republic of Korea
Sungjoon Choi
Korea University
Robotics