🤖 AI Summary
Accurately delineating narrow, sediment-rich rivers remains difficult with low-resolution satellite imagery. Method: We introduce RiverScope, a high-resolution river and surface water mask dataset comprising 1,145 PlanetScope images covering 2,577 km², labeled by experts in over 100 hours of manual annotation and co-registered with Sentinel-2, SWOT, and the SWOT River Database (SWORD). We evaluate deep networks across architectures (CNNs and transformers), pretraining strategies (supervised and self-supervised), and training datasets (ImageNet and satellite imagery); the best-performing models combine transfer learning with learned adaptors that use all multispectral PlanetScope channels. Contribution/Results: We establish the first global, high-resolution benchmark for river width estimation and achieve a median error of 7.2 m, significantly outperforming existing satellite-derived methods. The multi-sensor co-registration supports cost-accuracy trade-off analysis across sensors, and the dataset serves as a resource for fine-scale surface water dynamics monitoring.
📝 Abstract
Surface water dynamics play a critical role in Earth's climate system, influencing ecosystems, agriculture, disaster resilience, and sustainable development. Yet monitoring rivers and surface water at fine spatial and temporal scales remains challenging -- especially for narrow or sediment-rich rivers that are poorly captured by low-resolution satellite data. To address this, we introduce RiverScope, a high-resolution dataset developed through collaboration between computer science and hydrology experts. RiverScope comprises 1,145 high-resolution images (covering 2,577 square kilometers) with expert-labeled river and surface water masks, requiring over 100 hours of manual annotation. Each image is co-registered with Sentinel-2, SWOT, and the SWOT River Database (SWORD), enabling the evaluation of cost-accuracy trade-offs across sensors -- a key consideration for operational water monitoring. We also establish the first global, high-resolution benchmark for river width estimation, achieving a median error of 7.2 meters -- significantly outperforming existing satellite-derived methods. We extensively evaluate deep networks across multiple architectures (e.g., CNNs and transformers), pretraining strategies (e.g., supervised and self-supervised), and training datasets (e.g., ImageNet and satellite imagery). Our best-performing models combine the benefits of transfer learning with the use of all the multispectral PlanetScope channels via learned adaptors. RiverScope provides a valuable resource for fine-scale and multi-sensor hydrological modeling, supporting climate adaptation and sustainable water management.
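As a rough illustration of the headline metric, the sketch below computes a median width error over paired measurements. It assumes "median error" means the median of absolute differences between predicted and reference widths in meters; the abstract does not spell out the exact definition, and the function name and sample values are hypothetical, not RiverScope data.

```python
from statistics import median


def median_abs_width_error(predicted_m, reference_m):
    """Median of |predicted - reference| over paired river widths, in meters.

    Assumption: the error is absolute and unnormalized; the paper's exact
    definition may differ.
    """
    if len(predicted_m) != len(reference_m):
        raise ValueError("predicted and reference widths must be paired")
    if not predicted_m:
        raise ValueError("need at least one width pair")
    return median(abs(p - r) for p, r in zip(predicted_m, reference_m))


# Made-up widths in meters, for illustration only.
predicted = [43.0, 114.0, 18.0, 75.0, 260.0]
reference = [40.0, 118.0, 30.0, 74.0, 250.0]

error_m = median_abs_width_error(predicted, reference)
print(error_m)  # absolute errors are 3, 4, 12, 1, 10, so the median is 4.0
```

A median is less sensitive than a mean to a handful of badly mis-segmented reaches, which is one plausible reason to report it for width estimates.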