🤖 AI Summary
This study addresses the computational bottleneck that traditional CPU-bound SGP4 implementations face when propagating orbits for mega-constellations exceeding 100,000 satellites. The authors present jaxsgp4, an open-source reimplementation of SGP4 in JAX that recasts the algorithm in a pure functional form and uses just-in-time (JIT) compilation, automatic vectorization, and GPU acceleration for massively parallel propagation. On a single NVIDIA A100 GPU, the implementation propagates each of 9,341 Starlink satellites to 1,000 future time steps in under 4 milliseconds, a speedup of roughly 1,500× over traditional C++ baselines. The authors also argue that single-precision (32-bit) floating-point arithmetic is a principled trade-off for SGP4 propagation, costing negligible precision while substantially increasing throughput on hardware accelerators.
📝 Abstract
As the population of anthropogenic space objects transitions from sparse clusters to mega-constellations exceeding 100,000 satellites, traditional orbital propagation techniques face a critical bottleneck. Standard CPU-bound implementations of the Simplified General Perturbations 4 (SGP4) algorithm are not well suited to the scale required by collision avoidance and Space Situational Awareness (SSA) tasks. This paper introduces \texttt{jaxsgp4}, an open-source, high-performance reimplementation of SGP4 built on the \texttt{JAX} library. \texttt{JAX} has gained traction in computational research, offering an easy mechanism for Just-In-Time (JIT) compilation, automatic vectorisation and automatic optimisation of code for CPU, GPU and TPU hardware. By refactoring the algorithm into a pure functional paradigm, we leverage these transformations to execute massively parallel propagations on modern GPUs. We demonstrate that \texttt{jaxsgp4} can propagate the entire Starlink constellation (9,341 satellites), each to 1,000 future time steps, in under 4 ms on a single A100 GPU, representing a speedup of $1500\times$ over traditional C++ baselines. Furthermore, we argue that the use of 32-bit precision for SGP4 propagation tasks offers a principled trade-off, incurring negligible precision loss in exchange for a substantial gain in throughput on hardware accelerators.
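The pattern the abstract describes (a pure function, vectorised over satellites and time steps, then JIT-compiled) can be illustrated with a minimal sketch. This is not the \texttt{jaxsgp4} API and it does not implement SGP4: the hypothetical `propagate_one` below is a toy circular two-body propagator standing in for the real algorithm, and the element layout `(a_km, m0_rad)` is an assumption made for illustration only. The arrays use JAX's default 32-bit floats.

```python
import jax
import jax.numpy as jnp

MU_EARTH = 398600.4418  # Earth's gravitational parameter, km^3/s^2

def propagate_one(elements, t):
    """Toy stand-in for one SGP4 call (NOT SGP4): circular two-body motion.

    elements: (semi-major axis in km, mean anomaly at epoch in rad)
    t: seconds since epoch
    Returns the in-plane position (x, y) in km.
    """
    a, m0 = elements
    n = jnp.sqrt(MU_EARTH / a**3)  # mean motion, rad/s
    m = m0 + n * t                 # angle along the circular orbit
    return jnp.stack([a * jnp.cos(m), a * jnp.sin(m)])

# Inner vmap maps over time steps, outer vmap maps over satellites.
# jit compiles the doubly vectorised function for the available backend.
propagate_all = jax.jit(
    jax.vmap(jax.vmap(propagate_one, in_axes=(None, 0)), in_axes=(0, None))
)

n_sats, n_times = 8, 5  # small illustrative sizes
a = jnp.linspace(6900.0, 7000.0, n_sats)
m0 = jnp.linspace(0.0, 1.0, n_sats)
times = jnp.linspace(0.0, 600.0, n_times)

positions = propagate_all((a, m0), times)  # shape (n_sats, n_times, 2)
```

Because `propagate_one` is pure (no in-place mutation, no Python-side branching on array values), the same function can be vectorised and compiled without modification, which is the property the functional refactoring of SGP4 relies on.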