HMR-Net: Hierarchical Modular Routing for Cross-Domain Object Detection in Aerial Images

📅 2026-04-20

🤖 AI Summary
This work addresses the limited generalization capability of object detection in aerial imagery caused by discrepancies in spatial resolution, scene composition, and semantic labeling across domains. To tackle this challenge, the authors propose a hierarchical modular routing framework that integrates geospatial-aware global routing with category-semantic-conditioned expert modules. By combining global expert allocation with local scene decomposition, the method specializes at two levels, across datasets and within individual scenes, and supports zero-shot detection of novel categories without fine-tuning. Experiments on four aerial image datasets show improvements in multi-domain generalization, region-specific detection accuracy, and open-category recognition.
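
One common way to realize semantic-conditioned, open-category scoring is to compare a region feature against text embeddings of category names, so that a new category can be added at inference time simply by adding its embedding. The sketch below illustrates that general idea only; it is not the paper's implementation. The function `classify_region`, the `threshold` parameter, and the toy three-dimensional embeddings are all hypothetical, and a real system would obtain the embeddings from a text encoder.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return num / den if den else 0.0

def classify_region(region_feature, category_embeddings, threshold=0.5):
    """Score a region against every category embedding.

    Returns (best_category_or_None, all_scores). The threshold is an
    assumed rejection rule for regions that match no known category.
    """
    scores = {name: cosine(region_feature, emb)
              for name, emb in category_embeddings.items()}
    best = max(scores, key=scores.get)
    return (best if scores[best] >= threshold else None), scores

# Hypothetical toy embeddings standing in for text-encoder outputs.
CATEGORY_EMBEDDINGS = {
    "airplane": [0.9, 0.1, 0.0],
    "ship": [0.1, 0.9, 0.0],
}

# A novel category is added at inference time; nothing is retrained.
CATEGORY_EMBEDDINGS["helipad"] = [0.0, 0.1, 0.9]

label, scores = classify_region([0.05, 0.0, 1.0], CATEGORY_EMBEDDINGS)
```

Because the category set is just a dictionary of embeddings, extending the vocabulary does not touch any learned weights, which is the property the summary refers to as detection without fine-tuning.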

📝 Abstract
Despite advances in object detection, aerial imagery remains a challenging domain, as models often fail to generalize across variations in spatial resolution, scene composition, and semantic label coverage. Differences in geographic context, sensor characteristics, and object distributions across datasets limit the capacity of conventional models to learn consistent and transferable representations. Shared models trained on such data tend to impose a unified representation across fundamentally different domains, resulting in poor performance on region-specific content and reduced flexibility when dealing with novel object categories. To address this, we propose a novel modular learning framework that enables structured specialization in aerial detection. Our method introduces a hierarchical routing mechanism with two levels of modularity: a global expert assignment layer that uses latent geographic embeddings to route datasets to specialized processing modules, and a local scene decomposition mechanism that allocates image subregions to region-specific sub-modules. This allows our method to specialize across datasets and within complex scenes. Additionally, the framework contains a conditional expert module that uses external semantic information (e.g., category names or textual descriptions) to enable detection of novel object categories during inference, without the need for retraining or fine-tuning. By moving beyond monolithic representations, our method offers an adaptive framework for remote sensing object detection. Comprehensive evaluations on four datasets highlight improvements in multi-dataset generalization, regional specialization, and open-category detection.
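
The two routing levels described in the abstract can be illustrated with a minimal sketch. It assumes dot-product scoring against per-expert key vectors and hard top-1 selection at both levels; the abstract does not specify the scoring function, so these are illustrative choices rather than the authors' design. All names (`route_global`, `route_local`, `EXPERT_KEYS`, `SUBMODULE_KEYS`) and the toy vectors are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def route_global(geo_embedding, expert_keys):
    """Level 1: choose one expert for the whole image from its latent geographic embedding."""
    weights = softmax([dot(geo_embedding, k) for k in expert_keys])
    best = max(range(len(weights)), key=weights.__getitem__)
    return best, weights

def route_local(region_features, submodule_keys):
    """Level 2: assign each image subregion to one sub-module of the chosen expert."""
    assignments = []
    for feat in region_features:
        scores = [dot(feat, k) for k in submodule_keys]
        assignments.append(max(range(len(scores)), key=scores.__getitem__))
    return assignments

# Hypothetical toy parameters: two experts, each with two sub-modules.
EXPERT_KEYS = [[1.0, 0.0], [0.0, 1.0]]
SUBMODULE_KEYS = {
    0: [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    1: [[0.0, 0.0, 1.0], [0.0, 1.0, 0.0]],
}

def hierarchical_route(geo_embedding, region_features):
    """Run global expert assignment, then local scene decomposition."""
    expert, weights = route_global(geo_embedding, EXPERT_KEYS)
    regions = route_local(region_features, SUBMODULE_KEYS[expert])
    return expert, weights, regions

expert, weights, regions = hierarchical_route(
    [0.2, 0.9],                          # latent geographic embedding of the image
    [[0.1, 0.0, 0.8], [0.0, 0.7, 0.1]],  # features of two subregions
)
```

In this toy setup the image is routed to a single expert and each subregion to a single sub-module. A soft mixture that weights several experts by the softmax output would be an equally plausible reading of the abstract.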
Problem

Research questions and friction points this paper is trying to address.

cross-domain object detection
aerial images
domain generalization
novel category detection
remote sensing
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical Modular Routing
Cross-Domain Object Detection
Geographic Embedding
Conditional Expert Module
Open-Category Detection