🤖 AI Summary
To address the limited generalizability of lunar surface perception models caused by the scarcity of real-world annotated data, this paper introduces POLAR-E: the first enhanced multimodal dataset specifically designed for lunar environments. Methodologically, we provide full-scale annotations—including bounding boxes and pixel-level semantic segmentation—for all 23,000 rock, shadow, and crater instances in NASA's POLAR dataset; further, we construct high-fidelity digital twin scenes covering 13 terrain scenarios, integrating geometric modeling, physically based rendering (PBR) materials, and physics-driven illumination to enable parameterized synthesis of images with precise semantic labels under arbitrary viewpoints, lighting conditions, and exposure settings. Our contributions are threefold: (1) the first comprehensive, fine-grained semantic annotation suite for lunar surface imagery; (2) the first scalable, physically realistic digital twin framework for lunar scene generation; and (3) full open-sourcing of the annotations and digital twin assets, significantly enhancing training scale, data diversity, and robustness evaluation under sparse-data regimes.
📝 Abstract
NASA's POLAR dataset contains approximately 2,600 pairs of high dynamic range stereo photos captured across 13 varied terrain scenarios, including areas with sparse or dense rock distributions, craters, and rocks of different sizes. The purpose of these photos is to spur development in robotics, AI-based perception, and autonomous navigation. Acknowledging the scarcity of images of the lunar polar regions, NASA Ames produced, on Earth and under controlled conditions, images that resemble rover operating conditions in these regions of the Moon. We report on the outcomes of an effort aimed at accomplishing two tasks. In Task 1, we provided bounding boxes and semantic segmentation information for all the images in NASA's POLAR dataset. This effort resulted in 23,000 labels and semantic segmentation annotations pertaining to rocks, shadows, and craters. In Task 2, we generated digital twins of the 13 scenarios used to produce all the photos in the POLAR dataset. Specifically, for each of these scenarios, we produced individual meshes, texture information, and material properties associated with the ground and the rocks in the scenario. This allows anyone with a camera model to synthesize images associated with any of the 13 scenarios of the POLAR dataset. Effectively, one can generate as many semantically labeled synthetic images as desired -- with different locations and exposure values in the scene, for different positions of the sun, with or without the presence of active illumination, etc. The benefit of this work is twofold. Using the outcomes of Task 1, one can train and/or test perception algorithms that deal with Moon images. Using the outcomes of Task 2, one can produce as much data as desired to train and test AI algorithms that are anticipated to operate in lunar conditions. All the outcomes of this work are available in a public repository for unfettered use and distribution.
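The parameterized synthesis described above (arbitrary camera pose, sun position, exposure, and active illumination over the 13 digital-twin scenarios) can be sketched as a simple render-request structure. This is a minimal illustrative sketch, not the released API: the class, field names, and validation limits are assumptions; the released assets provide meshes, textures, and materials, and any renderer with a camera model could consume a request like this.

```python
from dataclasses import dataclass, asdict
import json


@dataclass
class RenderRequest:
    """Hypothetical request for one synthetic image from a POLAR digital twin."""
    scenario_id: int           # 1..13, one of the POLAR terrain scenarios
    sun_azimuth_deg: float     # sun direction around the scene
    sun_elevation_deg: float   # low angles mimic lunar polar lighting
    exposure_ev: float         # exposure value, e.g. for HDR bracketing
    active_illumination: bool  # rover-mounted light source on/off
    camera_pose: tuple         # (x, y, z, yaw, pitch, roll), units assumed

    def validate(self):
        # POLAR defines exactly 13 terrain scenarios.
        if not 1 <= self.scenario_id <= 13:
            raise ValueError("scenario_id must be in 1..13")
        if not 0.0 <= self.sun_elevation_deg <= 90.0:
            raise ValueError("sun elevation must be in [0, 90] degrees")
        return self

    def to_json(self):
        # Serialize for handing off to a renderer of one's choosing.
        return json.dumps(asdict(self.validate()))


# Example: a low-sun, underexposed view of scenario 7 without active lighting.
req = RenderRequest(scenario_id=7, sun_azimuth_deg=120.0,
                    sun_elevation_deg=5.0, exposure_ev=-2.0,
                    active_illumination=False,
                    camera_pose=(0.0, 0.0, 1.5, 0.0, -15.0, 0.0))
print(req.to_json())
```

Sweeping such requests over sun positions and exposure values is what makes it possible to generate an arbitrarily large, semantically labeled synthetic training set from the 13 fixed scenes.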