AI Summary
To address unreliable localization in robot navigation caused by uneven viewpoint information content, this paper proposes ActLoc, the first localization-enhancement framework for active viewpoint selection. The method uses an attention-based model trained at large scale to jointly encode map features and camera pose history, predicting localization confidence distributions across yaw and pitch angles at arbitrary 3D locations. These confidence distributions are then integrated into global path planning to enable task-driven active viewpoint optimization and trajectory-level decision-making. The approach unifies map encoding, pose modeling, and attention-driven perception-planning co-optimization. Evaluated on both single-viewpoint selection and full-trajectory planning tasks, ActLoc achieves state-of-the-art performance, demonstrates strong generalization across diverse environments, and features a modular architecture compatible with various autonomous navigation and inspection scenarios.
Abstract
Reliable localization is critical for robot navigation, yet most existing systems implicitly assume that all viewing directions at a location are equally informative. In practice, localization becomes unreliable when the robot observes unmapped, ambiguous, or uninformative regions. To address this, we present ActLoc, an active viewpoint-aware planning framework for enhancing localization accuracy for general robot navigation tasks. At its core, ActLoc employs a large-scale trained attention-based model for viewpoint selection. The model encodes a metric map and the camera poses used during map construction, and predicts localization accuracy across yaw and pitch directions at arbitrary 3D locations. These per-point accuracy distributions are incorporated into a path planner, enabling the robot to actively select camera orientations that maximize localization robustness while respecting task and motion constraints. ActLoc achieves state-of-the-art results on single-viewpoint selection and generalizes effectively to full-trajectory planning. Its modular design makes it readily applicable to diverse robot navigation and inspection tasks.
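To make the last step more concrete, here is a minimal sketch of how per-waypoint accuracy distributions over yaw and pitch could be turned into a sequence of camera orientations. It is not ActLoc's planner. It assumes a hypothetical array `acc` of shape `(T, Y, P)` holding predicted accuracy for `T` waypoints, `Y` yaw bins, and `P` pitch bins. It uses a simple limit on bin changes between consecutive waypoints as a stand-in for motion constraints, and a Viterbi-style dynamic program as a swapped-in selection technique. The function name and parameters are illustrative only.

```python
import numpy as np


def select_orientations(acc, max_yaw_step=1, max_pitch_step=1):
    """Choose one (yaw_bin, pitch_bin) per waypoint, maximizing summed accuracy.

    acc: hypothetical predicted accuracy, shape (T, Y, P).
    max_yaw_step / max_pitch_step: assumed motion limits, expressed as the
    largest allowed bin change between consecutive waypoints (yaw wraps around).
    """
    acc = np.asarray(acc, dtype=float)
    T, Y, P = acc.shape

    # Circular distance between yaw bins, plain distance between pitch bins.
    yaw = np.arange(Y)
    pitch = np.arange(P)
    dyaw = np.abs(yaw[:, None] - yaw[None, :])
    dyaw = np.minimum(dyaw, Y - dyaw)
    dpitch = np.abs(pitch[:, None] - pitch[None, :])

    # allowed[prev_state, cur_state], with state index = yaw_bin * P + pitch_bin.
    allowed = (dyaw[:, None, :, None] <= max_yaw_step) & (
        dpitch[None, :, None, :] <= max_pitch_step
    )
    allowed = allowed.reshape(Y * P, Y * P)

    score = acc[0].reshape(-1).copy()
    back = np.zeros((T, Y * P), dtype=int)
    for t in range(1, T):
        # Best predecessor for every current state, ignoring forbidden moves.
        cand = np.where(allowed, score[:, None], -np.inf)
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0) + acc[t].reshape(-1)

    # Backtrack from the best final state.
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    path.reverse()
    return [divmod(s, P) for s in path]
```

Picking the highest-scoring orientation independently at each waypoint can demand camera motions the robot cannot execute. The dynamic program instead trades a slightly lower score at one waypoint for a feasible sequence overall, which is the kind of trajectory-level trade-off the abstract describes.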