🤖 AI Summary
Lunar surface operations lack conventional navigation infrastructure such as GPS, and visual-inertial odometry (VIO) accumulates drift over long traverses, so it cannot by itself provide the precise global localization that in-situ resource utilization (ISRU) missions require. To address this, the authors propose LunarLoc, a drift-free multi-session global localization approach with two parts: (1) zero-shot instance segmentation of onboard stereo imagery extracts boulder landmarks without manual annotation, and the detections are assembled into a graph-based representation of the terrain; (2) graph-theoretic data association aligns that representation with a reference map captured during a previous session. In multi-session global localization experiments, LunarLoc achieves sub-centimeter accuracy and significantly outperforms the state of the art in lunar global localization. The authors also publicly release their datasets together with a playback module.
📝 Abstract
Global localization is necessary for autonomous operations on the lunar surface, where traditional Earth-based navigation infrastructure such as GPS is unavailable. As NASA advances toward a sustained lunar presence under the Artemis program, autonomous operations will be an essential component of tasks such as robotic exploration and infrastructure deployment. Excavation and transport of regolith, for example, require precise pose estimation, but proposed approaches such as visual-inertial odometry (VIO) accumulate drift over long traverses. Precise pose estimation is particularly important for upcoming missions such as the ISRU Pilot Excavator (IPEx), which rely on autonomous agents operating over extended timescales and varied terrain. To help overcome odometry drift over long traverses, we propose LunarLoc, an approach to global localization that leverages instance segmentation for zero-shot extraction of boulder landmarks from onboard stereo imagery. Segment detections are used to construct a graph-based representation of the terrain, which is then aligned with a reference map of the environment captured during a previous session using graph-theoretic data association. This method enables accurate and drift-free global localization in visually ambiguous settings. LunarLoc achieves sub-centimeter accuracy in multi-session global localization experiments, significantly outperforming the state of the art in lunar global localization. To encourage the development of further methods for global localization on the Moon, we release our datasets publicly with a playback module: https://github.com/mit-acl/lunarloc-data.
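The alignment step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes boulders are reduced to 2D centroids, uses pairwise-distance consistency with a maximum-clique search as one common form of graph-theoretic data association, and recovers the pose with a standard SVD-based rigid fit. All function names, the tolerance `eps`, and the example coordinates are hypothetical.

```python
import itertools
import numpy as np


def consistency_graph(ref, obs, eps=0.05):
    """Nodes are candidate associations (i, j) meaning ref[i] <-> obs[j].
    Two nodes are linked when they preserve the inter-landmark distance."""
    nodes = [(i, j) for i in range(len(ref)) for j in range(len(obs))]
    adj = {n: set() for n in nodes}
    for a, b in itertools.combinations(nodes, 2):
        (i, j), (k, l) = a, b
        if i == k or j == l:
            continue  # enforce one-to-one associations
        d_ref = np.linalg.norm(ref[i] - ref[k])
        d_obs = np.linalg.norm(obs[j] - obs[l])
        if abs(d_ref - d_obs) < eps:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def max_clique(adj):
    """Exact maximum clique via Bron-Kerbosch with pivoting (small graphs only)."""
    best = []

    def expand(r, p, x):
        nonlocal best
        if not p and not x:
            if len(r) > len(best):
                best = list(r)
            return
        if len(r) + len(p) <= len(best):
            return  # this branch cannot beat the best clique found so far
        pivot = max(p | x, key=lambda u: len(adj[u] & p))
        for v in list(p - adj[pivot]):
            expand(r + [v], p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand([], set(adj), set())
    return best


def rigid_transform(src, dst):
    """Least-squares rotation R and translation t with dst ~= R @ src + t."""
    cs, cd = src.mean(axis=0), dst.mean(axis=0)
    H = (src - cs).T @ (dst - cd)
    U, _, Vt = np.linalg.svd(H)
    D = np.eye(H.shape[0])
    D[-1, -1] = np.sign(np.linalg.det(Vt.T @ U.T))  # avoid reflections
    R = Vt.T @ D @ U.T
    return R, cd - R @ cs


def localize(ref, obs, eps=0.05):
    """Associate observed landmarks with the reference map, then fit the pose."""
    matches = sorted(max_clique(consistency_graph(ref, obs, eps)))
    ref_idx = [i for i, _ in matches]
    obs_idx = [j for _, j in matches]
    R, t = rigid_transform(obs[obs_idx], ref[ref_idx])
    return R, t, matches


# Hypothetical reference map of boulder centroids (map frame, metres).
ref = np.array([[0, 0], [4, 1], [1, 5], [7, 3], [3, 8], [9, 9]], dtype=float)

# Hypothetical ground-truth robot pose in the map frame.
theta = 0.6
R_true = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta), np.cos(theta)]])
t_true = np.array([12.0, -3.0])

# Second session: five known boulders seen in shuffled order, plus one
# spurious detection at map position (5.5, 6.0) that is not in the map.
seen = np.vstack([ref[[4, 0, 5, 1, 3]], [[5.5, 6.0]]])
obs = (seen - t_true) @ R_true  # robot-frame coordinates, R_true.T @ (p - t)

R_est, t_est, matches = localize(ref, obs)
print("associations (ref index, obs index):", matches)
print("estimated translation:", t_est)
```

Because the pose is computed directly from matched landmarks rather than by integrating odometry, the estimate does not accumulate drift; its accuracy depends on the landmark positions and on how many associations survive the consistency check. The exact clique search here is exponential in the worst case and is only practical for small landmark sets.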