MAction-SocialNav: Multi-Action Socially Compliant Navigation via Reasoning-enhanced Prompt Tuning

📅 2025-12-25

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address the homogenization of navigation behavior that arises from ambiguous social norms in human-robot coexistence scenarios, this paper proposes a multi-action paradigm for socially compliant navigation. It contributes: (1) a dual-annotated, multi-environment, multi-action navigation dataset; (2) a meta-cognitive prompt (MCP) method that strengthens the social reasoning of vision-language models; and (3) a framework that combines multi-turn dialogue modeling with multi-dimensional evaluation metrics, including Action Preference Gain (APG) and Ethical Robustness (ER). On the 79-sample test split of the 789-sample dataset, the approach reaches an APG of 0.595, against 0.000 and 0.025 for zero-shot GPT-4o and Claude, and an ER of 0.264, against 0.642 and 0.668, while running at 1.524 FPS, reported as over 3× faster. The framework can generate multiple socially acceptable navigation strategies for a single scenario, supporting more robust, norm-aware robot navigation.

📝 Abstract

Socially compliant navigation requires robots to move safely and appropriately in human-centered environments by respecting social norms. However, social norms are often ambiguous, and in a single scenario, multiple actions may be equally acceptable. Most existing methods simplify this problem by assuming a single correct action, which limits their ability to handle real-world social uncertainty. In this work, we propose MAction-SocialNav, an efficient vision-language model for socially compliant navigation that explicitly addresses action ambiguity by generating multiple plausible actions within one scenario. To enhance the model's reasoning capability, we introduce a novel meta-cognitive prompt (MCP) method. Furthermore, to evaluate the proposed method, we curate a multi-action socially compliant navigation dataset that accounts for diverse conditions, including crowd density, indoor and outdoor environments, and dual human annotations. The dataset contains 789 samples, each with a three-turn conversation, randomly split into 710 training samples and 79 test samples. We also design five evaluation metrics to assess high-level decision precision, safety, and diversity. Extensive experiments demonstrate that the proposed MAction-SocialNav achieves strong social reasoning performance while maintaining high efficiency, highlighting its potential for real-world human-robot navigation. Compared with zero-shot GPT-4o and Claude, our model achieves substantially higher decision quality (APG: 0.595 vs. 0.000/0.025) and safety alignment (ER: 0.264 vs. 0.642/0.668), while maintaining real-time efficiency (1.524 FPS, over 3x faster).
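The random 710/79 split described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function name `split_dataset`, the seed, and the integer sample IDs are assumptions, and the abstract does not say how the random selection was actually performed.

```python
import random


def split_dataset(samples, n_test, seed=0):
    """Randomly split samples into (train, test) with exactly n_test test items.

    Hypothetical helper: the seed and the shuffling strategy are assumptions.
    """
    rng = random.Random(seed)  # local RNG so the split is reproducible
    indices = list(range(len(samples)))
    rng.shuffle(indices)
    test_idx = set(indices[:n_test])
    # Preserve the original ordering inside each split.
    train = [s for i, s in enumerate(samples) if i not in test_idx]
    test = [s for i, s in enumerate(samples) if i in test_idx]
    return train, test


# Placeholder IDs standing in for the 789 annotated samples.
samples = list(range(789))
train, test = split_dataset(samples, n_test=79)
print(len(train), len(test))  # 710 79
```

Fixing the seed keeps the test set stable across runs, which matters when the test split is as small as 79 samples.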

Problem

Research questions and friction points this paper is trying to address.

Addresses ambiguous social norms in robot navigation

Enables generating multiple plausible actions per scenario

Enhances reasoning for real-world social uncertainty
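To make the single-action versus multi-action distinction concrete, the sketch below compares the two ways of judging a predicted action. Everything in it is hypothetical: the `Scenario` class, the action names, and both checking functions are illustrative assumptions and are not the paper's metrics or label set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """Hypothetical record: one scene plus every action judged acceptable."""
    scene_id: str
    acceptable_actions: frozenset


def single_action_correct(predicted, canonical):
    """Single-label view: only one designated action counts as correct."""
    return predicted == canonical


def multi_action_correct(predicted, scenario):
    """Multi-action view: any acceptable action counts as correct."""
    return predicted in scenario.acceptable_actions


# Illustrative scene where two actions are equally acceptable.
scene = Scenario("corridor_01", frozenset({"slow_down", "turn_left"}))
prediction = "turn_left"

print(single_action_correct(prediction, canonical="slow_down"))  # False
print(multi_action_correct(prediction, scene))                   # True
```

Under the single-label view, a socially acceptable alternative is scored as an error, which is the limitation the multi-action formulation is meant to remove.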

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-action navigation model addresses social ambiguity

Meta-cognitive prompt enhances reasoning capability

Dataset with diverse conditions enables robust evaluation

🔎 Similar Papers

Online Context Learning for Socially-compliant Navigation