🤖 AI Summary
To address the degradation of localization and mapping accuracy caused by unknown moving objects in dynamic environments, this paper proposes a robust visual SLAM framework that integrates semantic keypoint filtering with adaptive robust loss optimization. The method models residual distributions in real time using a Barron kernel whose shape parameter is estimated online, and couples it with a lightweight semantic keypoint filter, suppressing both known and unknown motion outliers without requiring predefined object categories. These components are integrated into the ORB-SLAM3 architecture together with an online residual analysis mechanism. Evaluations on the TUM RGB-D, Bonn, and OpenLORIS datasets show that the approach reduces absolute trajectory error (ATE) RMSE by up to 25% while running at an average of 27 FPS, offering a strong trade-off between accuracy and real-time performance.
📝 Abstract
Visual SLAM in dynamic environments remains challenging: many existing methods rely on semantic filtering that handles only known object classes, or on fixed robust kernels that cannot adapt to unknown moving objects, and accuracy degrades when such objects appear in the scene. We present VAR-SLAM (Visual Adaptive and Robust SLAM), an ORB-SLAM3-based system that combines a lightweight semantic keypoint filter, which handles known moving objects, with Barron's adaptive robust loss for unknown ones. The shape parameter of the robust kernel is estimated online from the residuals, allowing the system to adjust automatically between Gaussian and heavy-tailed behavior. We evaluate VAR-SLAM on the TUM RGB-D, Bonn RGB-D Dynamic, and OpenLORIS datasets, which include both known and unknown moving objects. Results show improved trajectory accuracy and robustness over state-of-the-art baselines, with up to 25% lower ATE RMSE than NGD-SLAM on challenging sequences, while running at 27 FPS on average.
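To make the adaptive-kernel idea concrete, the sketch below implements Barron's general robust loss, whose single shape parameter α interpolates between a quadratic (Gaussian) penalty at α = 2 and increasingly heavy-tailed, outlier-tolerant penalties as α decreases. This is a minimal illustration of the published loss function only; the paper's online estimation of α from residuals and its integration into ORB-SLAM3's optimization are not shown, and the function name and scale parameter `c` are our own choices.

```python
import math

def barron_loss(x, alpha, c=1.0):
    """Barron's general robust loss applied to a residual x.

    alpha = 2      -> quadratic (L2 / Gaussian) behavior
    alpha = 0      -> Cauchy/Lorentzian-like behavior
    alpha -> -inf  -> bounded Welsch/Leclerc-like behavior
    c is a scale parameter setting the size of the loss's quadratic bowl.
    """
    z = (x / c) ** 2
    # Special cases where the general expression is undefined as a limit:
    if alpha == 2:
        return 0.5 * z
    if alpha == 0:
        return math.log(0.5 * z + 1.0)
    if math.isinf(alpha) and alpha < 0:
        return 1.0 - math.exp(-0.5 * z)
    # General case of the loss.
    b = abs(alpha - 2.0)
    return (b / alpha) * ((z / b + 1.0) ** (alpha / 2.0) - 1.0)
```

A system like the one described would lower α when the residual distribution develops heavy tails (e.g. from an unknown moving object), so that large residuals are down-weighted: for a residual of 2.0, the loss is 2.0 at α = 2 but only 1.0 at α = -2.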