🤖 AI Summary
This work addresses the challenge of insufficient reliability in autonomous orbital collision avoidance within partially observable and sparsely monitored space environments. To this end, the authors propose a Transformer-based reinforcement learning framework that integrates the Transformer architecture into Partially Observable Markov Decision Process (POMDP) modeling. By leveraging the Transformer’s long-range temporal attention alongside a distance-dependent observation model, a sequential state estimator, and a custom-designed conjunction simulation environment, the method effectively handles high-noise, intermittent space observation data. Experimental results demonstrate that the proposed approach significantly outperforms conventional strategies under incomplete observational conditions, achieving more robust and reliable autonomous collision avoidance.
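A distance-dependent observation model of the kind described above can be sketched as follows. This is an illustrative assumption, not the paper's implementation: measurement noise grows with relative range, and the probability of receiving any measurement at all decays with range, producing the noisy, intermittent observations the agent must cope with. The function name and all parameters (`sigma0`, `k`, `d_ref`) are hypothetical.

```python
import numpy as np

def observe(true_rel_pos, rng, sigma0=50.0, k=1e-4, d_ref=1e5):
    """Hypothetical distance-dependent observation model (illustrative only).

    Noise standard deviation grows linearly with range, and the chance of
    obtaining a measurement at all decays with range, mimicking sparse,
    imperfect space surveillance. All parameter values are assumptions.
    """
    d = np.linalg.norm(true_rel_pos)       # relative distance [m]
    sigma = sigma0 + k * d                 # noise std grows with range
    p_detect = np.exp(-d / d_ref)          # detection probability decays with range
    if rng.random() > p_detect:
        return None                        # missed observation (intermittent)
    return true_rel_pos + rng.normal(0.0, sigma, size=3)
```

Under this model, a nearby object is observed almost every step with modest noise, while a distant one is observed only occasionally and with large errors.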
📝 Abstract
We introduce a Transformer-based reinforcement learning framework for autonomous orbital collision avoidance that explicitly models the effects of partial observability and imperfect monitoring in space operations. The framework combines a configurable encounter simulator, a distance-dependent observation model, and a sequential state estimator to represent uncertainty in relative motion. A central contribution of this work is the use of a transformer-based Partially Observable Markov Decision Process (POMDP) architecture, which leverages long-range temporal attention to interpret noisy and intermittent observations more effectively than traditional architectures. This integration provides a foundation for training collision avoidance agents that can operate more reliably in imperfectly monitored environments.
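The role of long-range temporal attention under intermittent observations can be illustrated with a minimal self-attention layer over an observation history. This is a sketch under assumptions, not the paper's architecture: missing time steps are excluded via an attention mask, so the pooled features depend only on the valid measurements. The random projections stand in for learned weights, and all names (`attention_pool`, `d_k`) are hypothetical.

```python
import numpy as np

def attention_pool(obs_seq, mask, d_k=8, seed=0):
    """Minimal masked self-attention over an observation history (sketch).

    obs_seq: (T, d) array of per-step observation features.
    mask:    (T,) array, 1 where an observation was received, 0 where missed.
    Masked steps are excluded from the keys/values, which is one way a
    Transformer can handle intermittent measurements in a POMDP.
    """
    rng = np.random.default_rng(seed)
    T, d = obs_seq.shape
    # Random projections stand in for learned Q/K/V weight matrices.
    Wq, Wk, Wv = (rng.normal(0, d ** -0.5, (d, d_k)) for _ in range(3))
    Q, K, V = obs_seq @ Wq, obs_seq @ Wk, obs_seq @ Wv
    scores = Q @ K.T / np.sqrt(d_k)                     # (T, T) attention logits
    scores = np.where(mask[None, :] > 0, scores, -1e9)  # mask out missed steps
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)                  # softmax over valid steps
    return w @ V                                        # context features per step
```

Because masked steps receive zero attention weight, corrupting a missed observation leaves the features at every other step unchanged, which is exactly the robustness to gaps that motivates the transformer-based POMDP design.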