Multi-Agent Reinforcement Learning for Autonomous Multi-Satellite Earth Observation: A Realistic Case Study

📅 2025-06-18
🤖 AI Summary
Problem: Autonomous collaborative Earth observation by low-Earth-orbit (LEO) satellite constellations faces challenges arising from dynamic orbital environments, severe resource constraints (e.g., energy and onboard storage), and partial observability. Method: We establish a near-realistic multi-satellite dynamics and mission simulation environment, and conduct the first systematic evaluation of MARL algorithms (PPO, IPPO, MAPPO, and HAPPO) under decentralized, partially observable Markov games. To address the non-stationarity and reward coupling inherent in Earth observation tasks, we propose a training stabilization strategy tailored to these characteristics. Contribution/Results: Our approach significantly improves multi-satellite collaborative imaging efficiency and onboard resource utilization while maintaining high task performance and enhancing system robustness. It delivers a deployable, real-time, on-orbit autonomous decision-making solution grounded in MARL, validated under realistic operational constraints.

📝 Abstract
The exponential growth of Low Earth Orbit (LEO) satellites has revolutionised Earth Observation (EO) missions, addressing challenges in climate monitoring, disaster management, and more. However, autonomous coordination in multi-satellite systems remains a fundamental challenge. Traditional optimisation approaches struggle to handle the real-time decision-making demands of dynamic EO missions, necessitating the use of Reinforcement Learning (RL) and Multi-Agent Reinforcement Learning (MARL). In this paper, we investigate RL-based autonomous EO mission planning by modelling single-satellite operations and extending to multi-satellite constellations using MARL frameworks. We address key challenges, including energy and data storage limitations, uncertainties in satellite observations, and the complexities of decentralised coordination under partial observability. By leveraging a near-realistic satellite simulation environment, we evaluate the training stability and performance of state-of-the-art MARL algorithms, including PPO, IPPO, MAPPO, and HAPPO. Our results demonstrate that MARL can effectively balance imaging and resource management while addressing non-stationarity and reward interdependency in multi-satellite coordination. The insights gained from this study provide a foundation for autonomous satellite operations, offering practical guidelines for improving policy learning in decentralised EO missions.

Problem

Research questions and friction points this paper is trying to address.

Autonomous coordination in multi-satellite Earth Observation systems
Real-time decision-making for dynamic satellite missions using MARL
Balancing imaging and resource management under partial observability

Innovation

Methods, ideas, or system contributions that make the work stand out.

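Applies Multi-Agent Reinforcement Learning (MARL) to decentralised multi-satellite EO mission planning, extending a single-satellite RL model
Uses a near-realistic satellite simulation environment with energy and data storage limitations
Evaluates the training stability and performance of PPO, IPPO, MAPPO, and HAPPO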
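
To make the trade-off between imaging and resource management under partial observability more concrete, here is a minimal toy environment in Python. It is only a sketch under stated assumptions and is not the paper's simulator: the class name `ToyConstellationEnv`, the four actions, the integer battery and storage units, the 50% target-visibility chance, and the shared team reward are all hypothetical choices made for illustration.

```python
import random
from dataclasses import dataclass

# Hypothetical action set; the paper's actual action space is not stated here.
IDLE, IMAGE, CHARGE, DOWNLINK = range(4)
CAPACITY = 10  # assumed integer capacity for both battery and storage


@dataclass
class Satellite:
    battery: int = CAPACITY  # remaining energy units
    storage: int = 0         # occupied onboard storage units


class ToyConstellationEnv:
    """Toy decentralised EO environment (illustrative assumptions only)."""

    def __init__(self, n_sats=3, horizon=20, seed=0):
        self.n_sats = n_sats
        self.horizon = horizon
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        self.t = 0
        self.sats = [Satellite() for _ in range(self.n_sats)]
        self._sample_visibility()
        return self._observations()

    def _sample_visibility(self):
        # Assumed: each satellite has a target in view with probability 0.5.
        self.target_visible = [self.rng.random() < 0.5 for _ in range(self.n_sats)]

    def _observations(self):
        # Partial observability: each agent sees only its own battery,
        # storage, and visibility flag, never its teammates' state.
        return [
            (sat.battery, sat.storage, int(vis))
            for sat, vis in zip(self.sats, self.target_visible)
        ]

    def step(self, actions):
        captured = 0
        for i, (sat, act) in enumerate(zip(self.sats, actions)):
            if act == IMAGE:
                # Imaging needs a visible target, 2 energy units, 2 free storage units.
                if (self.target_visible[i] and sat.battery >= 2
                        and sat.storage + 2 <= CAPACITY):
                    sat.battery -= 2
                    sat.storage += 2
                    captured += 1
            elif act == CHARGE:
                sat.battery = min(CAPACITY, sat.battery + 3)
            elif act == DOWNLINK and sat.battery >= 1:
                sat.battery -= 1
                sat.storage = max(0, sat.storage - 5)
            # IDLE and infeasible actions leave the satellite unchanged.
        self.t += 1
        self._sample_visibility()
        # Shared team reward: every agent's return depends on all agents' actions.
        rewards = [float(captured)] * self.n_sats
        done = self.t >= self.horizon
        return self._observations(), rewards, done


if __name__ == "__main__":
    env = ToyConstellationEnv()
    obs, done, total = env.reset(), False, 0.0
    policy_rng = random.Random(1)
    while not done:
        # Random independent policies stand in for learned per-agent policies.
        actions = [policy_rng.randrange(4) for _ in obs]
        obs, rewards, done = env.step(actions)
        total += rewards[0]
    print("team return under a random policy:", total)
```

In this sketch, a satellite that images greedily runs out of energy or storage and must spend steps charging or downlinking, which is the balance the learned policies have to strike. Because the reward is shared, each agent's learning signal also shifts as its teammates' policies change, which is one simple way non-stationarity and reward interdependency can arise.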