🤖 AI Summary
This work addresses the challenge of safe and efficient navigation for autonomous mobile robots in dynamic, crowded environments. The authors propose a novel approach that integrates Vision-Language Models (VLMs) with Gaussian Process Regression (GPR) to jointly perceive static obstacles and dynamic crowds. According to the authors, this is the first use of a VLM to recognize and abstract the semantic concept of crowd density; the resulting semantic observations are then fused via GPR into a probabilistic dynamic crowd-density map. This map enables the robot to plan collision-free paths that account for both stationary obstacles and moving pedestrians. Experiments in real-world campus scenarios demonstrate that the proposed method improves navigation safety and environmental adaptability compared to existing approaches.
📝 Abstract
Autonomous mobile robots offer promising solutions to labor shortages and can increase operational efficiency. However, navigating safely and effectively in dynamic environments, particularly crowded areas, remains challenging. This paper proposes a novel framework that integrates Vision-Language Models (VLMs) and Gaussian Process Regression (GPR) to generate dynamic crowd-density maps ("Abstraction Maps") for autonomous robot navigation. Our approach leverages a VLM's ability to recognize abstract environmental concepts, such as crowd density, and represents them probabilistically via GPR. Experimental results from real-world trials on a university campus demonstrate that robots successfully generated routes avoiding both static obstacles and dynamic crowds, enhancing navigation safety and adaptability.
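To make the VLM-plus-GPR pipeline concrete, the sketch below shows one plausible way to fuse sparse semantic crowd-density observations into a probabilistic map. The paper does not publish code, so everything here is an assumption: the label-to-score table, the kernel choice, the grid extent, and the function names are all hypothetical, and scikit-learn's `GaussianProcessRegressor` stands in for whatever GPR implementation the authors used.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Hypothetical numeric scores for crowd-density labels a VLM might emit.
DENSITY_SCORE = {"empty": 0.0, "sparse": 0.3, "moderate": 0.6, "dense": 1.0}

def build_density_map(observations, grid_size=20, extent=10.0):
    """Fit a GPR to (x, y, label) observations and evaluate it on a grid.

    observations: list of (x, y, label) tuples, where label is a
    crowd-density word as a VLM might produce for that location.
    Returns (mean, std) arrays of shape (grid_size, grid_size) — a
    probabilistic density map with per-cell uncertainty.
    """
    X = np.array([[x, y] for x, y, _ in observations])
    y = np.array([DENSITY_SCORE[label] for _, _, label in observations])

    # RBF kernel spreads each observation spatially; WhiteKernel models
    # noise in the VLM's density estimates. Hyperparameters are guesses.
    kernel = RBF(length_scale=2.0) + WhiteKernel(noise_level=0.01)
    gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
    gpr.fit(X, y)

    xs = np.linspace(0.0, extent, grid_size)
    grid = np.array([[gx, gy] for gx in xs for gy in xs])
    mean, std = gpr.predict(grid, return_std=True)
    return mean.reshape(grid_size, grid_size), std.reshape(grid_size, grid_size)

# Example: two "dense" sightings near one corner, "empty" far away.
obs = [(1.0, 1.0, "dense"), (2.0, 1.5, "dense"),
       (8.0, 8.0, "empty"), (5.0, 5.0, "moderate")]
mean, std = build_density_map(obs)
```

A planner could then treat `mean` as a traversal-cost layer and `std` as an exploration signal, steering routes through low-density, low-uncertainty cells — one way to realize the collision-free planning the paper describes.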