AI Summary
Service robots face significant challenges in real-time perception of door existence and passability within unstructured indoor environments, hindering reliable inference of dynamic topological maps. Method: This paper proposes a two-stage domain adaptation framework: (1) photorealistic synthetic data generation from robot-centric viewpoints, and (2) domain-specific qualification, a fine-tuning and robustness-validation mechanism, to bridge the sim-to-real gap. Contribution/Results: The approach achieves high data efficiency and deployment reliability, overcoming the generalization limitations of generic vision models in long-horizon real-world scenarios. Extensive real-world evaluations across diverse indoor environments demonstrate that the method meets the stringent requirements of long-term robotic deployment, achieving high door-detection accuracy (>92%) and real-time inference (<50 ms per frame), while significantly reducing the cost of domain adaptation. Moreover, it robustly supports higher-level tasks such as tracking and updating the environment's topology.
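The qualification step summarized above can be read as a deployment gate: a fine-tuned detector is accepted only if it meets the accuracy and latency targets reported here. The following minimal Python sketch illustrates that gating logic; the names (`EvalResult`, `qualifies`) and the threshold defaults are illustrative assumptions drawn from the figures in this summary, not the authors' actual code.

```python
from dataclasses import dataclass

@dataclass
class EvalResult:
    correct: int        # correctly classified doorway instances
    total: int          # total instances in the validation set
    latency_ms: float   # mean per-frame inference time

def qualifies(result: EvalResult,
              min_accuracy: float = 0.92,   # assumed from the >92% target
              max_latency_ms: float = 50.0  # assumed from the <50 ms target
              ) -> bool:
    """Deployment gate: the detector must be both accurate and real-time."""
    accuracy = result.correct / result.total
    return accuracy > min_accuracy and result.latency_ms < max_latency_ms

# Example: a detector at 94% accuracy and 38 ms/frame passes the gate
ok = qualifies(EvalResult(correct=47, total=50, latency_ms=38.0))  # True
```

In practice such a check would run on a small real-world validation set collected in the deployment environment, so that only detectors robust to that domain reach long-term operation.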
Abstract
Autonomous service robots are becoming increasingly common in human-centric, long-term deployments in unstructured indoor environments. Robotic vision is a crucial capability, enabling robots to perceive and interpret high-level environmental features from visual input. While data-driven approaches based on deep learning have advanced the capabilities of vision systems, applying these techniques in real robotic scenarios still presents unique methodological challenges. Conventional datasets often do not represent the object categories that a service robot needs to detect. More importantly, state-of-the-art models struggle to address the demanding perception constraints faced by service robots, posing the need for adaptations to the specific environments in which the robots operate. We devise a method that addresses these challenges by leveraging photorealistic simulations to create synthetic visual datasets from a robot's perspective. This approach balances data quality with acquisition costs, enabling the training of deep, general-purpose detectors tailored for service robots. We then demonstrate the benefits of qualifying a general detector for the domain in which the robot is deployed, studying the trade-off between data-acquisition efforts and performance improvement. We evaluate our method using a representative selection of prominent deep-learning object detectors for the challenge of recognizing, in real time, the presence and traversability of doorways. This task, which we refer to as door detection, is fundamental to numerous significant robotic tasks, such as tracking the changing topology of dynamic environments. We conduct an extensive experimental campaign in the field, considering different real-world setups while emulating the typical challenges encountered in long-term deployments of service robots.
Our key findings demonstrate that simulation and qualification techniques can significantly reduce costs associated with domain adaptation for service robots. While simulation allows embedding the robot's perspective during the training of end-to-end robotic vision modules, qualification is essential to improve their robustness over challenging detection instances, thus reaching the performance level typically required by realistic long-term deployments of service robots.