🤖 AI Summary
This work addresses the limited generalization of existing remote sensing agents, which rely on predefined tools and struggle to adapt to diverse data and tasks in open environments. To overcome this, we propose the first tool-creating agent framework for open Earth observation, departing from conventional tool-calling paradigms. Our approach enables end-to-end autonomous observation through adaptive workflow planning, dynamic tool generation, cross-domain knowledge integration, and pre-trained model orchestration. We introduce a novel agent architecture that supports dynamic tool creation and multi-stage integration, alongside OpenEarth-Bench, the first benchmark for evaluating agents in open-world Earth observation, on which we evaluate performance across 596 real-world tasks. Remarkably, using only six foundational model-based tools, our agent achieves performance comparable to tool-calling baselines that depend on 104 specialized tools, while demonstrating superior robustness.
📝 Abstract
Earth Observation (EO) is essential for perceiving dynamic land surface changes, yet deploying autonomous EO in open environments is hindered by the immense diversity of multi-source data and heterogeneous tasks. While remote sensing agents have emerged to streamline EO workflows, existing tool-calling agents are confined to closed environments. They rely on predefined tools and are restricted to a narrow scope, limiting their generalization to diverse data and tasks. To overcome these limitations, we introduce OpenEarth-Agent, the first tool-creation agent framework tailored for open-environment EO. Rather than calling predefined tools, OpenEarth-Agent employs adaptive workflow planning and tool creation to generalize to unseen data and tasks. This adaptability is bolstered by open-ended integration of multi-stage tools and cross-domain knowledge bases, enabling robust execution across the entire EO pipeline and multiple application domains. To comprehensively evaluate EO agents in open environments, we propose OpenEarth-Bench, a novel benchmark comprising 596 real-world, full-pipeline cases across seven application domains, explicitly designed to assess agents' adaptive planning and tool creation capabilities. The benchmark provides only essential pre-trained model tools, with no other predefined task-specific tools. Extensive experiments demonstrate that OpenEarth-Agent successfully masters full-pipeline EO across multiple domains in the open environment. Notably, on the cross-benchmark Earth-Bench, our tool-creating agent equipped with six essential pre-trained models achieves performance comparable to tool-calling agents relying on 104 specialized tools, and significantly outperforms them when given access to the complete toolset. In several cases, the created tools exhibit superior robustness to data anomalies compared to human-engineered counterparts.