AI Summary
To address the dual challenges of fragmented task definitions and scarce instruction-based data in remote sensing image segmentation, this paper proposes UniGeoSeg, a unified modeling framework. Methodologically, it introduces (1) GeoSeg-1M, the first million-scale instruction-driven remote sensing segmentation dataset, encompassing diverse geographic objects and complex real-world scenes; (2) GeoSeg-Bench, a comprehensive benchmark enabling open-world segmentation evaluation; and (3) a novel architecture integrating task-aware textual augmentation, latent knowledge memory, and progressive multi-task training. Extensive experiments on GeoSeg-Bench and multiple public benchmarks demonstrate that UniGeoSeg significantly outperforms state-of-the-art methods, achieving superior zero-shot transferability and fine-grained understanding of geographic context. The framework establishes a new foundation for scalable, instruction-driven segmentation in remote sensing.
Abstract
Instruction-driven segmentation in remote sensing generates masks directly from user-provided guidance, offering great potential for accessible and generalizable applications. However, existing methods suffer from fragmented task formulations and limited instruction data, hindering effective understanding and generalization. To address these issues, we introduce GeoSeg-1M, the first million-scale dataset for instruction-driven segmentation in remote sensing, constructed via an automatic mask filtering and instruction generation pipeline that synthesizes referring, interactive, and reasoning segmentation instructions from multiple public datasets. GeoSeg-1M contains 590K images, 117 categories, and 1.1M image-mask-instruction triplets. Building upon this foundation, we further curate GeoSeg-Bench, a challenging benchmark designed to evaluate contextual understanding and reasoning capabilities across diverse instruction-driven tasks and complex geospatial scenes. Furthermore, we present UniGeoSeg, a unified framework that serves as a strong baseline, incorporating task-aware text enhancement, latent knowledge memory, and a progressive training strategy to facilitate multi-task learning. Extensive experiments demonstrate the state-of-the-art performance of UniGeoSeg across GeoSeg-Bench and diverse public benchmarks, while exhibiting strong zero-shot generalization. The datasets and source code are released at https://github.com/MiliLab/UniGeoSeg.