🤖 AI Summary
This work addresses real-time self-localization within a floorplan (e.g., of an indoor environment) without per-map retraining or a large database of images of the area of interest. The proposed lightweight, data-driven method operates solely on a generic monocular depth estimator and the floorplan itself. Its core contributions are: (1) a novel ray-based observation model that fuses single-view and multi-view horizontal depth predictions, removing the common requirement for upright input images; and (2) a temporal filtering module that recursively fuses observations in a Bayesian manner for efficient state updates. The full system runs in real time on consumer-grade hardware and significantly outperforms state-of-the-art methods [20], [28] in localization accuracy and robustness, while requiring minimal deployment overhead and no domain-specific training.
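To make the ray-based representation concrete, the sketch below casts horizontal rays from a candidate pose into a 2D occupancy grid of the floorplan and returns the expected depth along each ray; these expected depths can then be compared against the horizontal depths predicted from images. This is an illustrative sketch only, not the paper's implementation; the function name, grid convention, and all parameters are assumptions.

```python
import numpy as np

def cast_rays(floorplan, x, y, theta, n_rays=64, fov=np.pi / 2,
              max_range=10.0, step=0.05):
    """Cast horizontal rays from pose (x, y, theta) into a 2D occupancy
    grid (True = wall) and return the range at which each ray first hits
    a wall or leaves the map. Coordinates are in grid-cell units.
    Illustrative sketch; not the paper's actual observation model."""
    h, w = floorplan.shape
    angles = theta + np.linspace(-fov / 2, fov / 2, n_rays)
    depths = np.full(n_rays, max_range)
    for i, a in enumerate(angles):
        dx, dy = np.cos(a), np.sin(a)
        r = 0.0
        while r < max_range:
            cx, cy = int(x + r * dx), int(y + r * dy)
            # Stop at map boundaries or occupied (wall) cells.
            if not (0 <= cx < w and 0 <= cy < h) or floorplan[cy, cx]:
                depths[i] = r
                break
            r += step
    return depths
```

Comparing such rendered depths against network-predicted depths yields a per-pose likelihood, which is the quantity a probabilistic localizer needs.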
📝 Abstract
In this paper we propose an efficient data-driven solution to self-localization within a floorplan. Floorplan data is readily available, persistent over long periods, and inherently robust to changes in visual appearance. Our method requires neither retraining per map or location nor a large database of images of the area of interest. We propose a novel probabilistic model consisting of an observation module and a novel temporal filtering module. Operating internally on an efficient ray-based representation, the observation module combines a single-view and a multi-view component that predict horizontal depth from images, and fuses their results to benefit from the advantages of either methodology. Our method operates on conventional consumer hardware and overcomes a common limitation of competing methods [16], [17], [20], [28] that often demand upright images. Our full system meets real-time requirements while outperforming the state of the art [20], [28] by a significant margin.
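The temporal filtering module described above performs recursive Bayesian state estimation: each step diffuses the previous belief to account for motion uncertainty, then reweights it by the current observation likelihood. A minimal histogram-filter sketch over a discretized floorplan is shown below; the function name, the box-blur motion model, and the grid discretization are illustrative assumptions, not the paper's actual filter.

```python
import numpy as np

def bayes_filter_step(belief, likelihood, motion_blur=1):
    """One predict/update step of a recursive Bayesian (histogram) filter
    over grid cells of a floorplan. `belief` is the prior over cells,
    `likelihood` the per-cell observation likelihood (e.g., from comparing
    rendered and predicted ray depths). Illustrative sketch only."""
    # Predict: diffuse the belief with a small box blur to model
    # (unknown) motion noise between observations.
    k = 2 * motion_blur + 1
    kernel = np.ones(k) / k
    pred = belief.copy()
    for axis in range(belief.ndim):
        pred = np.apply_along_axis(
            lambda m: np.convolve(m, kernel, mode="same"), axis, pred)
    # Update: weight by the observation likelihood and renormalize.
    post = pred * likelihood
    return post / post.sum()
```

Because only elementwise products and small convolutions are involved, such an update is cheap enough to run per frame on consumer hardware.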