AI Summary
This work exposes a critical vulnerability of CNN-based SLAM systems (e.g., GCN-SLAM) to black-box adversarial attacks. Addressing the lack of systematic robustness evaluation of feature detectors in prior work, we design and validate black-box adversarial perturbations at both the RGB and depth image input levels, specifically targeting the feature detection module. Experiments on the TUM dataset reveal that moderate-strength RGB attacks cause tracking failure in 76% of frames, while depth-image attacks induce catastrophic system-level failures. Our key contributions are: (1) the first systematic demonstration of the structural fragility of CNN-SLAM feature detectors under black-box settings; and (2) empirical evidence that the depth modality is significantly more vulnerable than RGB, providing crucial insights for secure multi-modal SLAM design. These findings underscore the urgent need for robustness-aware architectures in vision-based localization systems.
Abstract
Continuous advancements in deep learning have led to significant progress in feature detection, resulting in enhanced accuracy in tasks like Simultaneous Localization and Mapping (SLAM). Nevertheless, the vulnerability of deep neural networks to adversarial attacks remains a challenge for their reliable deployment in applications such as the navigation of autonomous agents. Even though CNN-based SLAM algorithms are a growing area of research, there is a notable absence of a comprehensive presentation and examination of adversarial attacks targeting CNN-based feature detectors as part of a SLAM system. Our work introduces black-box adversarial perturbations applied to the RGB images fed into the GCN-SLAM algorithm. Our findings on the TUM dataset [30] reveal that even attacks of moderate scale can lead to tracking failure in as many as 76% of the frames. Moreover, our experiments highlight the catastrophic impact of attacking depth instead of RGB input images on the SLAM system.
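The abstract does not spell out how the perturbations are constructed. As a rough illustration only, a black-box (gradient-free) additive perturbation on an input frame can be sketched as below; the helper name `black_box_perturb`, the attack budget `epsilon`, and the frame dimensions are illustrative assumptions, not the paper's actual attack:

```python
import numpy as np

def black_box_perturb(image, epsilon=8.0, seed=None):
    """Generic black-box additive perturbation (illustrative sketch).

    The attacker needs no access to the feature detector's weights or
    gradients; `epsilon` bounds the per-pixel change (L-infinity norm).
    """
    rng = np.random.default_rng(seed)
    noise = rng.uniform(-epsilon, epsilon, size=image.shape)
    # Clip back to the valid 8-bit pixel range after adding the noise.
    return np.clip(image.astype(np.float64) + noise, 0, 255).astype(np.uint8)

# Example: perturb a synthetic 640x480 RGB frame (TUM resolution) and
# verify the perturbation stays within the attack budget.
rgb = np.full((480, 640, 3), 128, dtype=np.uint8)
adv = black_box_perturb(rgb, epsilon=8.0, seed=0)
assert np.max(np.abs(adv.astype(int) - rgb.astype(int))) <= 8
```

The same operation applies unchanged to a single-channel depth map, which is how an attack on the depth input (rather than RGB) could be prototyped.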