🤖 AI Summary
Conventional control systems exhibit limited adaptability under model uncertainties, diverse multi-spacecraft configurations, and dynamic mission requirements. To address this, the study proposes an intelligent maneuvering method for autonomous visual inspection of space targets. It integrates the model-based reinforcement learning framework DreamerV3 into six-degree-of-freedom orbital rendezvous tasks, achieving cross-configuration and cross-scenario policy generalization with high sample efficiency in the high-fidelity Space Robotics Bench simulation environment. Compared with the model-free baselines PPO and TD3, DreamerV3 reduces mean trajectory tracking error by 32% and reaches convergence in 41% fewer training steps. It also remains robust across varying inspection trajectories and spacecraft geometries. This work establishes a scalable, autonomous decision-making foundation for on-orbit servicing and space situational awareness.
📝 Abstract
The growing need for autonomous on-orbit services such as inspection, maintenance, and situational awareness calls for intelligent spacecraft capable of complex maneuvers around large orbital targets. Traditional control systems often fall short in adaptability, especially under model uncertainties, multi-spacecraft configurations, or dynamically evolving mission contexts. This paper introduces RL-AVIST, a Reinforcement Learning framework for Autonomous Visual Inspection of Space Targets. Leveraging the Space Robotics Bench (SRB), we simulate high-fidelity 6-DOF spacecraft dynamics and train agents using DreamerV3, a state-of-the-art model-based RL algorithm, with PPO and TD3 as model-free baselines. Our investigation focuses on 3D proximity maneuvering tasks around targets such as the Lunar Gateway and other space assets. We evaluate task performance under two complementary regimes: generalized agents trained on randomized velocity vectors, and specialized agents trained to follow fixed trajectories emulating known inspection orbits. Furthermore, we assess the robustness and generalization of policies across multiple spacecraft morphologies and mission domains. Results demonstrate that model-based RL offers promising capabilities in trajectory fidelity and sample efficiency, paving the way for scalable, retrainable control solutions for future space operations.
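The abstract does not specify the relative-motion model underlying the proximity maneuvering tasks. As an illustrative sketch only, the classic Clohessy-Wiltshire equations are the standard linearized model for the translational dynamics of a chaser near a target in a circular orbit, and they show the kind of state propagation an inspection environment like SRB must provide (the function name and the mean-motion value below are assumptions, not taken from the paper):

```python
import math

def cw_propagate(state, t, n):
    """Propagate a relative state (x, y, z, vx, vy, vz) for time t
    under the Clohessy-Wiltshire linearized equations.

    x: radial, y: along-track, z: cross-track offsets from the target;
    n: mean motion of the target's circular orbit [rad/s].
    """
    x0, y0, z0, vx0, vy0, vz0 = state
    s, c = math.sin(n * t), math.cos(n * t)
    # Closed-form state-transition solution of the CW equations.
    x = (4 - 3 * c) * x0 + (s / n) * vx0 + (2 / n) * (1 - c) * vy0
    y = (6 * (s - n * t) * x0 + y0 - (2 / n) * (1 - c) * vx0
         + (1 / n) * (4 * s - 3 * n * t) * vy0)
    z = c * z0 + (s / n) * vz0
    vx = 3 * n * s * x0 + c * vx0 + 2 * s * vy0
    vy = -6 * n * (1 - c) * x0 - 2 * s * vx0 + (4 * c - 3) * vy0
    vz = -n * s * z0 + c * vz0
    return (x, y, z, vx, vy, vz)

# A drift-free (closed) inspection orbit requires vy0 = -2*n*x0;
# after one orbital period the chaser returns to its initial state.
n = 0.0011  # roughly a 95-minute LEO orbit, rad/s (illustrative)
start = (100.0, 0.0, 0.0, 0.0, -2 * n * 100.0, 0.0)
period = 2 * math.pi / n
end = cw_propagate(start, period, n)
```

The drift-free initial condition is what "known inspection orbits" in the specialized training regime would correspond to in this linearized setting; a randomized-velocity regime would instead sample `vy0` (and the other velocity components) freely, generally producing drifting trajectories.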