A Multi-Agent Feedback System for Detecting and Describing News Events in Satellite Imagery

📅 2026-04-14

🤖 AI Summary
Multi-temporal remote sensing event captioning datasets are expensive to build because visible events must be found manually and the corresponding image sequences annotated by hand. This work proposes SkyScraper, an iterative multi-agent feedback system that geocodes news articles, retrieves matching satellite image sequences, and synthesizes captions for them. In the authors' experiments, SkyScraper finds 5x more events than traditional geocoding methods, and applying it to a large database of global news articles yields a new multi-temporal captioning dataset of 5,000 sequences. Because it automatically identifies imagery related to news events, the system also supports journalism and reporting efforts.

📝 Abstract
Changes in satellite imagery often occur over multiple time steps. Despite the emergence of bi-temporal change captioning datasets, there is a lack of multi-temporal event captioning datasets (at least two images per sequence) in remote sensing. This gap exists because (1) searching for visible events in satellite imagery and (2) labeling multi-temporal sequences require significant time and labor. To address these challenges, we present SkyScraper, an iterative multi-agent workflow that geocodes news articles and synthesizes captions for corresponding satellite image sequences. Our experiments show that SkyScraper successfully finds 5x more events than traditional geocoding methods, demonstrating that agentic feedback is an effective strategy for surfacing new multi-temporal events in satellite imagery. We apply our framework to a large database of global news articles, curating a new multi-temporal captioning dataset with 5,000 sequences. By automatically identifying imagery related to news events, our work also supports journalism and reporting efforts.
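The abstract does not spell out how the agents exchange feedback, so the following is only a minimal sketch of one way a geocode-then-verify loop could work. It is not the authors' implementation. The `GAZETTEER` table, the `propose` function, the `is_visible` callback, and the `max_rounds` parameter are all hypothetical names introduced for illustration: a proposer suggests a location for an article, a verifier decides whether the event is visible there, and each rejection is fed back so the next proposal differs.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Candidate:
    """A proposed location for a news event (hypothetical structure)."""
    place: str
    lat: float
    lon: float


# Toy gazetteer standing in for a real geocoding service (assumption).
GAZETTEER = {
    "Springfield": [
        Candidate("Springfield, IL", 39.80, -89.64),
        Candidate("Springfield, MA", 42.10, -72.59),
    ],
}


def propose(article: str, rejected: set) -> Optional[Candidate]:
    """Proposer stand-in: first gazetteer match that was not rejected yet."""
    for name, candidates in GAZETTEER.items():
        if name in article:
            for cand in candidates:
                if cand.place not in rejected:
                    return cand
    return None


def run_feedback_loop(
    article: str,
    is_visible: Callable[[Candidate], bool],
    max_rounds: int = 3,
) -> Optional[Candidate]:
    """Iterate propose -> verify, feeding each rejection back to the proposer."""
    rejected = set()
    for _ in range(max_rounds):
        cand = propose(article, rejected)
        if cand is None:
            return None  # proposer has run out of candidates
        if is_visible(cand):
            return cand  # verifier accepted this location
        rejected.add(cand.place)  # feedback for the next round
    return None


if __name__ == "__main__":
    # Stub verifier: pretends imagery only shows the event north of 40 degrees.
    found = run_feedback_loop(
        "Flooding hits Springfield, Massachusetts",
        is_visible=lambda c: c.lat > 40.0,
    )
    print(found)
```

In a real system the verifier would inspect retrieved satellite image sequences rather than a latitude threshold, and the accepted location would then be passed to a captioning step. Both are omitted here.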
Problem

Research questions and friction points this paper is trying to address.

multi-temporal
satellite imagery
event captioning
remote sensing
dataset
Innovation

Methods, ideas, or system contributions that make the work stand out.

multi-agent feedback
multi-temporal captioning
satellite imagery
geocoding
event detection