ShelfAware: Real-Time Visual-Inertial Semantic Localization in Quasi-Static Environments with Low-Cost Sensors

📅 2025-12-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
In quasi-static indoor environments (e.g., retail stores), visual localization often fails due to geometric repetition and dynamic semantic changes. To address this, we propose a semantic particle filter framework that enables robust global localization using only a low-cost monocular camera and visual-inertial odometry (VIO). Our key contributions are: (1) modeling per-class spatial distributions of scene semantics (to our knowledge, the first such formulation) and introducing an inverse semantic proposal mechanism that guides hypothesis generation in Monte Carlo Localization (MCL), effectively mitigating geometric aliasing and semantic drift; (2) integrating a depth-aware likelihood, category-centered semantic similarity, and a precomputed semantic viewpoint library. Experiments under four highly challenging conditions show our method achieves a 96% global localization success rate (vs. below 22% for baselines), converges in 1.91 s on average, attains the lowest translation RMSE, maintains stable tracking in 80% of sequences, and runs in real time on consumer-grade laptops.

📝 Abstract
Many indoor workspaces are quasi-static: global layout is stable but local semantics change continually, producing repetitive geometry, dynamic clutter, and perceptual noise that defeat vision-based localization. We present ShelfAware, a semantic particle filter for robust global localization that treats scene semantics as statistical evidence over object categories rather than fixed landmarks. ShelfAware fuses a depth likelihood with a category-centric semantic similarity and uses a precomputed bank of semantic viewpoints to perform inverse semantic proposals inside MCL, yielding fast, targeted hypothesis generation on low-cost, vision-only hardware. Across 100 global-localization trials spanning four conditions (cart-mounted, wearable, dynamic obstacles, and sparse semantics) in a semantically dense retail environment, ShelfAware achieves a 96% success rate (vs. 22% MCL and 10% AMCL) with a mean time-to-convergence of 1.91 s, attains the lowest translational RMSE in all conditions, and maintains stable tracking in 80% of tested sequences, all while running in real time on a consumer laptop-class platform. By modeling semantics distributionally at the category level and leveraging inverse proposals, ShelfAware resolves geometric aliasing and semantic drift common to quasi-static domains. Because the method requires only vision sensors and VIO, it integrates as an infrastructure-free building block for mobile robots in warehouses, labs, and retail settings; as a representative application, it also supports the creation of assistive devices providing start-anytime, shared-control assistive navigation for people with visual impairments.
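The core particle-filter measurement step described in the abstract, fusing a depth likelihood with a category-level semantic similarity, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's actual implementation: the Gaussian depth model, the cosine similarity over category histograms, and the fusion exponent `alpha` are all hypothetical choices.

```python
import numpy as np

def depth_likelihood(expected_depth, measured_depth, sigma=0.5):
    """Gaussian likelihood of measured depths given the depths expected
    at each particle's pose (sigma is an assumed sensor noise scale)."""
    err = measured_depth - expected_depth
    return np.exp(-0.5 * (err / sigma) ** 2)

def semantic_similarity(expected_hist, observed_hist):
    """Cosine similarity between category histograms, e.g. counts of
    detected object classes expected at a pose vs. currently observed."""
    num = float(np.dot(expected_hist, observed_hist))
    den = np.linalg.norm(expected_hist) * np.linalg.norm(observed_hist) + 1e-12
    return num / den

def update_weights(weights, depth_l, sem_l, alpha=0.5):
    """Fuse the two likelihoods geometrically and renormalize the
    particle weights; alpha balances geometry vs. semantics."""
    fused = (depth_l ** alpha) * (sem_l ** (1.0 - alpha))
    w = weights * fused
    return w / w.sum()
```

In this sketch, a particle whose predicted depth and predicted category histogram both match the observation keeps a high weight, while particles in geometrically similar but semantically different aisles are down-weighted, which is the mechanism the abstract credits for resolving geometric aliasing.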
Problem

Research questions and friction points this paper is trying to address.

Robust localization in quasi-static environments with repetitive geometry and dynamic clutter
Resolving geometric aliasing and semantic drift using category-level semantic modeling
Enabling infrastructure-free mobile robot navigation with low-cost vision-only sensors
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses semantic particle filter for robust global localization
Fuses depth likelihood with category-centric semantic similarity
Employs inverse semantic proposals for fast hypothesis generation
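The inverse semantic proposal idea above can be illustrated as follows: given the currently observed category histogram, look up the precomputed viewpoint library for the best-matching stored viewpoints and sample new pose hypotheses near them. This is a hypothetical sketch; representing the library as pose/histogram arrays, matching by cosine similarity, and perturbing samples with Gaussian noise are all assumptions not specified in this summary.

```python
import numpy as np

rng = np.random.default_rng(0)

def inverse_semantic_proposal(viewpoint_poses, viewpoint_hists, observed_hist,
                              n_samples=50, top_k=3, pos_noise=0.3):
    """Sample new particle hypotheses near the library viewpoints whose
    stored category histograms best match the current observation.

    viewpoint_poses: (N, d) array of library poses
    viewpoint_hists: (N, C) array of per-viewpoint category histograms
    observed_hist:   (C,) histogram of currently detected categories
    """
    # Cosine similarity between each stored histogram and the observation.
    sims = viewpoint_hists @ observed_hist / (
        np.linalg.norm(viewpoint_hists, axis=1) * np.linalg.norm(observed_hist) + 1e-12)
    best = np.argsort(sims)[-top_k:]          # indices of top-k matches
    picks = rng.choice(best, size=n_samples)  # draw anchors among them
    # Perturb the chosen library poses to seed diverse hypotheses.
    return viewpoint_poses[picks] + rng.normal(
        0.0, pos_noise, size=(n_samples, viewpoint_poses.shape[1]))
```

Injecting such semantically targeted hypotheses alongside ordinary motion-model proposals is one plausible way the reported fast convergence (mean 1.91 s) could be achieved, since particles are seeded directly where the observed semantics are likely.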