🤖 AI Summary
Earth observation satellite scheduling faces significant challenges, including severe task over-subscription and tightly coupled constraints such as visibility windows and slew-time delays, which necessitate joint optimization of task selection and temporal sequencing. To address this, we propose the first end-to-end framework integrating Graph Neural Networks (GNNs) with Deep Reinforcement Learning (DRL). Our method constructs a task-dependency graph that explicitly encodes spatiotemporal constraints and trains a sequential decision-making policy network that jointly optimizes observation selection and scheduling. Unlike conventional approaches, it requires no hand-crafted rules or multi-stage decomposition. Evaluated on realistic-scale instances, it achieves weighted-reward performance on par with or exceeding state-of-the-art heuristic and search-based algorithms. The model trains efficiently and generalizes well, from training on small instances to large-scale operational scenarios, demonstrating robust scalability and adaptability.
📝 Abstract
Earth Observation Satellite Planning (EOSP) is a difficult optimization problem of considerable practical interest. A set of requested observations must be scheduled on an agile Earth observation satellite while respecting constraints on their visibility windows, as well as maneuver constraints that impose varying delays between successive observations. In addition, the problem is heavily oversubscribed: there are many more candidate observations than can possibly be achieved. One must therefore select the subset of observations to perform, maximizing their weighted cumulative benefit, and propose a feasible schedule for them. Whereas previous work mostly focused on heuristic and iterative search algorithms, this paper presents a new technique for selecting and scheduling observations based on Graph Neural Networks (GNNs) and Deep Reinforcement Learning (DRL). GNNs extract relevant information from the graphs representing EOSP instances, and DRL drives the search for optimal schedules. Our simulations show that the approach is able to learn on small problem instances and generalize to larger real-world instances, with very competitive performance compared to traditional approaches.
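The selection-and-scheduling loop described above can be sketched in a few lines. This is a minimal illustration under simplifying assumptions, not the paper's actual formulation: the names (`Observation`, `schedule`), the constant slew delay, and the use of a static priority score in place of the learned GNN/DRL policy are all hypothetical stand-ins.

```python
# Illustrative sketch of EOSP-style selection and scheduling.
# Assumptions (not from the paper): a constant slew delay, and a fixed
# weight-based priority in place of the learned GNN policy scores.
from dataclasses import dataclass

@dataclass
class Observation:
    name: str
    start: float     # visibility window opens
    end: float       # visibility window closes
    duration: float  # time needed to acquire the observation
    weight: float    # benefit if scheduled

SLEW = 2.0  # assumed constant maneuver delay between successive observations

def schedule(candidates):
    """Greedily pick observations by weight, keeping the schedule feasible.

    A trained policy network would rank candidates using GNN embeddings of
    the task-dependency graph; here a static weight ordering stands in.
    Returns the chosen observation names (in execution order) and their
    cumulative weighted benefit.
    """
    chosen, total, t = [], 0.0, 0.0
    for obs in sorted(candidates, key=lambda o: -o.weight):
        ready = t + (SLEW if chosen else 0.0)  # earliest start after maneuver
        begin = max(ready, obs.start)
        if begin + obs.duration <= obs.end:    # fits inside visibility window
            chosen.append(obs.name)
            total += obs.weight
            t = begin + obs.duration
        # otherwise the observation is dropped (over-subscription)
    return chosen, total
```

With three overlapping candidates, the highest-weight tasks that still fit their windows (after slew delays) are kept and the rest are dropped, which mirrors the oversubscribed setting: selection and timing must be decided jointly.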