Learning Implicit Social Navigation Behavior using Deep Inverse Reinforcement Learning

📅 2025-01-12
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses social navigation in dynamic, densely populated indoor environments. We propose Smooth Maximum Entropy Deep Inverse Reinforcement Learning (S-MEDIRL), the first method to jointly infer an implicit social reward map from sparse human trajectory demonstrations and scene geometry. Unlike rule-based approaches, S-MEDIRL implicitly learns navigational constraints and unspoken social norms — including yielding, lateral evasion, and deadlock avoidance — without hand-crafted rules. Leveraging trajectory modeling and geometric encoding, it integrates with an optimization-based local navigation controller. Evaluated in photo-realistic narrow-corridor head-on encounter scenarios, S-MEDIRL significantly outperforms ORCA and rule-based baselines: the deadlock rate decreases by 76%, and the system achieves autonomous, human-like, socially compliant navigation. The framework further supports cross-scene generalization and behavioral extrapolation beyond the observed demonstrations.

📝 Abstract
This paper reports on learning a reward map for social navigation in dynamic environments, where the robot can reason about its path at any time given agents' trajectories and scene geometry. Humans navigating dense, dynamic indoor environments follow several implicit social rules, and a rule-based approach cannot model all possible interactions among humans, robots, and scenes. We propose a novel Smooth Maximum Entropy Deep Inverse Reinforcement Learning (S-MEDIRL) algorithm that extrapolates beyond expert demonstrations to better encode scene navigability from few-shot demonstrations. The agent learns to predict cost maps by reasoning over trajectory data and scene geometry, then samples a trajectory that is executed by a local crowd-navigation controller. We present results in a photo-realistic simulation environment with a robot and a human navigating a narrow crossing scenario. The robot implicitly learns to exhibit social behaviors such as yielding to oncoming traffic and avoiding deadlocks. We compare the proposed approach to the popular model-based crowd-navigation algorithm ORCA and to a rule-based agent that exhibits yielding.
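The paper's code is not reproduced here, but the core machinery the abstract describes — maximum-entropy IRL that recovers a per-state reward (cost map) from demonstrations via soft value iteration and feature-count matching — can be illustrated with a tabular sketch. This is a minimal, hypothetical toy (a 1-D corridor with a goal at one end), not the authors' deep, smoothed S-MEDIRL variant; all function names and the gridworld setup are assumptions for illustration.

```python
import numpy as np

def soft_value_iteration(P, reward, gamma=0.95, iters=80):
    """Soft (MaxEnt) value iteration; returns a stochastic policy pi[a, s]."""
    A, S, _ = P.shape
    V = np.zeros(S)
    for _ in range(iters):
        Q = reward[None, :] + gamma * (P @ V)   # Q[a, s]
        V = np.log(np.exp(Q).sum(axis=0))       # soft-max over actions
    return np.exp(Q - V[None, :])

def expected_svf(P, policy, p0, T):
    """Expected state-visitation frequencies over a horizon of T steps."""
    d, svf = p0.copy(), p0.copy()
    for _ in range(T - 1):
        # next-state distribution: sum over current state i and action a
        d = np.einsum('i,ai,aij->j', d, policy, P)
        svf += d
    return svf

def maxent_irl(P, F, expert_svf, p0, T, lr=0.1, epochs=60):
    """Gradient ascent on theta: grad = expert feature counts - expected counts."""
    theta = np.zeros(F.shape[1])
    for _ in range(epochs):
        policy = soft_value_iteration(P, F @ theta)
        mu = expected_svf(P, policy, p0, T)
        theta += lr * (F.T @ expert_svf - F.T @ mu)
    return theta

# Toy corridor: 5 states, actions {left, right}, deterministic with clamping.
S, A = 5, 2
P = np.zeros((A, S, S))
for s in range(S):
    P[0, s, max(s - 1, 0)] = 1.0       # left
    P[1, s, min(s + 1, S - 1)] = 1.0   # right
F = np.eye(S)                          # one-hot state features
p0 = np.zeros(S); p0[0] = 1.0          # episodes start at state 0

# One expert demo: walk right to the goal (state 4) and stay there.
demo = [0, 1, 2, 3, 4, 4, 4, 4]
expert = np.bincount(demo, minlength=S).astype(float)

theta = maxent_irl(P, F, expert, p0, T=len(demo))
```

With one-hot features the learned `theta` is itself the per-state reward, and it concentrates on the goal state the expert dwells in; the paper's deep variant replaces this tabular reward with a network over trajectory and scene-geometry inputs, and adds smoothing.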
Problem

Research questions and friction points this paper is trying to address.

Social Behavior Learning
Crowd Navigation
Robotics
Innovation

Methods, ideas, or system contributions that make the work stand out.

S-MEDIRL
Crowd Navigation
Social Interaction Learning