🤖 AI Summary
Scheduling time-critical follow-up observations for astronomical Targets of Opportunity (ToOs) remains a significant challenge: the problem is online and resource-constrained, so a telescope array must allocate limited resources and plan time-sensitive observation sequences in real time. Method: We propose a deep reinforcement learning framework that represents the timing dependencies between observation tasks as a directed acyclic graph (DAG) and improves a feasible schedule through online local rewriting, avoiding the prohibitive computational cost of rebuilding the whole schedule from scratch. Contribution/Results: Trained and evaluated in a simulation environment built from real-world observation scenarios, the method outperforms five popular heuristic schedulers. It adapts to diverse observational configurations (e.g., varying telescope numbers, field-of-view sizes, and scheduling horizons) and learns effective strategies with hindsight. The approach is aimed at real-time decision-making that preserves scheduling quality and temporal feasibility.
📝 Abstract
In astronomical observation, allocating the observation resources of a telescope array and planning follow-up observations for targets of opportunity (ToOs) are indispensable to scientific discovery. The problem is computationally challenging because observations must be scheduled online and many time-varying factors determine whether an observation can be conducted at all. This paper presents ROARS, a reinforcement learning approach for online astronomical resource-constrained scheduling. To capture the structure of astronomical observation scheduling, we represent every schedule as a directed acyclic graph (DAG) that encodes the timing dependencies between the observation tasks in the schedule. Deep reinforcement learning is used to learn a policy that improves a feasible solution by iterative local rewriting until convergence. This avoids constructing a complete solution from scratch, which is difficult in astronomical observation scenarios because the numerous spatial and temporal constraints make the computation prohibitively expensive. We develop a simulation environment based on real-world scenarios to evaluate the effectiveness of the proposed scheduling approach. The experimental results show that ROARS surpasses five popular heuristics, adapts to various observation scenarios, and learns effective strategies with hindsight.
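The following is a minimal sketch of the two ideas named in the abstract: a schedule stored as a DAG of timing dependencies, and improvement by iterative local rewriting until no rewrite helps. It is not the ROARS implementation. The `Task` fields, the swap-adjacent-tasks rewrite rule, and the value-based score are all assumptions made for illustration, and a greedy acceptance rule stands in for the learned reinforcement learning policy.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter


@dataclass(frozen=True)
class Task:
    # Hypothetical task model; the paper's actual task attributes may differ.
    name: str
    duration: int   # time units needed on a telescope
    release: int    # earliest allowed start (e.g. target becomes visible)
    deadline: int   # latest allowed finish
    value: float    # scientific value if completed in time


def build_dag(schedule):
    """Map each task name to the set of tasks that must finish before it starts.

    A schedule is a list of per-telescope queues; each task depends on the
    task queued immediately before it on the same telescope.
    """
    preds = {}
    for queue in schedule:
        for i, task in enumerate(queue):
            preds[task.name] = {queue[i - 1].name} if i > 0 else set()
    return preds


def start_times(schedule):
    """Compute start times with a forward pass in topological order."""
    tasks = {t.name: t for queue in schedule for t in queue}
    preds = build_dag(schedule)
    start = {}
    for name in TopologicalSorter(preds).static_order():
        ready = max((start[p] + tasks[p].duration for p in preds[name]), default=0)
        start[name] = max(ready, tasks[name].release)
    return start


def score(schedule):
    """Total value of tasks that finish by their deadline."""
    start = start_times(schedule)
    return sum(
        t.value
        for queue in schedule
        for t in queue
        if start[t.name] + t.duration <= t.deadline
    )


def neighbours(schedule):
    """Yield schedules reachable by one local rewrite: swap two adjacent tasks."""
    for q, queue in enumerate(schedule):
        for i in range(len(queue) - 1):
            new_queue = queue[:i] + [queue[i + 1], queue[i]] + queue[i + 2:]
            yield schedule[:q] + [new_queue] + schedule[q + 1:]


def local_rewrite(schedule):
    """Apply improving rewrites until none remains (greedy stand-in for a policy)."""
    best, best_score = schedule, score(schedule)
    improved = True
    while improved:
        improved = False
        for candidate in neighbours(best):
            s = score(candidate)
            if s > best_score:
                best, best_score, improved = candidate, s, True
                break
    return best, best_score


# One telescope: a long routine task queued ahead of an urgent ToO follow-up.
routine = Task("A", duration=5, release=0, deadline=100, value=1.0)
too = Task("B", duration=2, release=0, deadline=3, value=5.0)
best, total = local_rewrite([[routine, too]])
print([t.name for t in best[0]], total)  # prints ['B', 'A'] 6.0
```

In this toy case the initial schedule misses the urgent task's deadline, and a single swap recovers it without rebuilding the schedule. A learned policy would replace the greedy choice inside `local_rewrite`, deciding which part of the DAG to rewrite and how.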