AI Summary
Existing visual place recognition methods struggle to balance memory efficiency, robustness, and generalization, limiting their deployment on resource-constrained mobile platforms. This work proposes a novel architecture that integrates event cameras with a Guided Variational Autoencoder (Guided VAE), employing a spiking neural network as the encoder to achieve low-power, compact, and illumination-invariant indoor place recognition. The approach is compatible with neuromorphic hardware and demonstrates strong performance on a newly curated indoor VPR dataset, distinguishing among 16 distinct locations with accuracy comparable to state-of-the-art methods. Moreover, it generalizes well to unseen environments and varying lighting conditions, highlighting its potential for real-world deployment in dynamic and resource-limited settings.
Abstract
Autonomous agents such as cars, robots, and drones need to localize themselves precisely in diverse environments, including GPS-denied indoor environments. One approach to precise localization is visual place recognition (VPR), which estimates the place depicted in an image based on previously seen places. State-of-the-art VPR models require large amounts of memory, making them unwieldy for mobile deployment, while more compact models lack robustness and generalization capabilities. This work overcomes these limitations for robotics using a combination of event-based vision sensors and a novel event-based guided variational autoencoder (VAE). The encoder of our model is a spiking neural network, which is compatible with power-efficient, low-latency neuromorphic hardware. The VAE successfully disentangles the visual features of 16 distinct places in our new indoor VPR dataset, with classification performance comparable to other state-of-the-art approaches, while also showing robust performance under various illumination conditions. When tested with novel visual inputs from unknown scenes, our model can distinguish between these places, demonstrating high generalization capability by learning the essential features of a location. Our compact and robust guided VAE with generalization capabilities is a promising model for visual place recognition that can significantly enhance mobile robot navigation in known and unknown indoor environments.
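The abstract's key architectural choice is a spiking encoder: event-camera output is inherently a stream of sparse, asynchronous spikes, which a spiking neural network can integrate directly. As a rough illustration of the building block such an encoder is made of, the sketch below implements a single leaky integrate-and-fire (LIF) neuron in plain Python. The decay factor and threshold are illustrative assumptions, not values from the paper.

```python
def lif_neuron(inputs, beta=0.9, threshold=1.0):
    """Minimal leaky integrate-and-fire neuron (illustrative sketch).

    Integrates a sequence of input currents; emits a spike (1) whenever
    the membrane potential crosses the threshold, then resets it.
    `beta` (leak) and `threshold` are assumed constants for illustration.
    """
    mem = 0.0
    spikes = []
    for current in inputs:
        mem = beta * mem + current      # leaky integration of input
        if mem >= threshold:
            spikes.append(1)            # fire
            mem = 0.0                   # hard reset after spiking
        else:
            spikes.append(0)            # stay silent
    return spikes


# A sub-threshold input only spikes once enough charge accumulates:
print(lif_neuron([0.6, 0.6, 0.6]))     # → [0, 1, 0]
```

In a full model along the lines the abstract describes, layers of such neurons would map event streams to the guided VAE's latent variables, with the guidance objective encouraging the latent code to disentangle place identity from nuisance factors such as illumination.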