GEVO: Memory-Efficient Monocular Visual Odometry Using Gaussians

📅 2024-09-14

🏛️ IEEE Robotics and Automation Letters

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Monocular SLAM based on Gaussian Splatting (GS) is memory-hungry on mobile devices because it stores a large number of past images to retrain the Gaussians and limit catastrophic forgetting. This paper proposes GEVO, a GS-based monocular SLAM framework that renders those past views from the existing map instead of storing them. Two ideas support this: (1) Gaussian initialization and optimization techniques that remove artifacts from the map and delay the degradation of rendered images over time; and (2) a map update that replaces stored images with images rendered on the fly. Across a variety of environments, GEVO reaches map fidelity comparable to prior methods while reducing memory overhead to around 58 MB, up to 94x lower than prior work. The target platforms are memory-constrained mobile devices such as smartphones, AR/VR headsets, and micro-robots.

📝 Abstract

Constructing a high-fidelity representation of the 3D scene using a monocular camera can enable a wide range of applications on mobile devices, such as micro-robots, smartphones, and AR/VR headsets. On these devices, memory is often limited in capacity, and memory access often dominates energy consumption. Although Gaussian Splatting (GS) allows for high-fidelity reconstruction of 3D scenes, current GS-based SLAM is not memory efficient because a large number of past images are stored to retrain the Gaussians and reduce catastrophic forgetting. These images often require two orders of magnitude more memory than the map itself and thus dominate the total memory usage. In this work, we present GEVO, a GS-based monocular SLAM framework that achieves fidelity comparable to prior methods by rendering these images from the existing map instead of storing them. Novel Gaussian initialization and optimization techniques are proposed to remove artifacts from the map and delay the degradation of the rendered images over time. Across a variety of environments, GEVO achieves comparable map fidelity while reducing the memory overhead to around 58 MB, which is up to 94x lower than prior works.
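
The memory argument in the abstract can be made concrete with a small back-of-the-envelope calculation. The sketch below is illustrative only: the keyframe count and image resolution are hypothetical values chosen for the example and do not come from the paper. Only the 58 MB figure and the 94x factor are taken from the abstract, and because the abstract says "up to 94x", the implied baseline is an upper-end estimate rather than a reported measurement.

```python
def keyframe_buffer_mb(num_keyframes, height, width,
                       channels=3, bytes_per_channel=1):
    """Size in MB (1 MB = 10**6 bytes) of uncompressed stored keyframes."""
    total_bytes = num_keyframes * height * width * channels * bytes_per_channel
    return total_bytes / 1e6


def implied_baseline_mb(reported_mb, reduction_factor):
    """Memory of a baseline that is `reduction_factor` times larger."""
    return reported_mb * reduction_factor


# Hypothetical workload (NOT from the paper): 500 RGB keyframes at 640x480,
# one byte per channel.
buffer_mb = keyframe_buffer_mb(num_keyframes=500, height=480, width=640)

# Figures from the abstract: about 58 MB, up to 94x lower than prior work.
baseline_mb = implied_baseline_mb(reported_mb=58, reduction_factor=94)

print(f"Stored keyframes (hypothetical workload): {buffer_mb:.1f} MB")
print(f"Implied upper-end baseline from the abstract: {baseline_mb} MB")
```

Under these assumed numbers, the image buffer alone is several times larger than the whole 58 MB budget, which is the motivation for rendering past views from the map rather than keeping them in memory.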

Problem

Research questions and friction points this paper is trying to address.

Monocular Depth Estimation

Memory Efficiency

Power Consumption

Innovation

Methods, ideas, or system contributions that make the work stand out.

GEVO

Memory-efficient

3D Mapping

🔎 Similar Papers

LEAP-VO: Long-term Effective Any Point Tracking for Visual Odometry