🤖 AI Summary
To address the low detection accuracy and imprecise localization of high aspect-ratio objects (e.g., aircraft, ships, bridges) in remote sensing imagery under long-distance and complex background conditions, this paper proposes Strip R-CNN, a novel two-stage detector built upon large strip-shaped convolutions. Its key contributions are: (1) the introduction of orthogonal strip convolutions, which replace conventional large square kernels, to more efficiently capture the spatial structure of elongated objects; and (2) the first decoupling of the localization branch in the detection head to operate exclusively on strip-aligned features, significantly improving bounding box regression precision. Built upon the Faster R-CNN framework, Strip R-CNN achieves 82.75% mAP on DOTA-v1.0, establishing a new state-of-the-art result. It also consistently outperforms existing methods across multiple benchmarks, including FAIR1M, HRSC2016, and DIOR.
📝 Abstract
Despite rapid progress, remote sensing object detection still struggles with high aspect ratio objects. This paper shows that large strip convolutions are good feature representation learners for remote sensing object detection and can detect objects of various aspect ratios well. Based on large strip convolutions, we build a new network architecture called Strip R-CNN, which is simple, efficient, and powerful. Unlike recent remote sensing object detectors that leverage large-kernel convolutions with square shapes, our Strip R-CNN takes advantage of sequential orthogonal large strip convolutions to capture spatial information. In addition, we enhance the localization capability of remote sensing object detectors by decoupling the detection heads and equipping the localization head with strip convolutions to better localize the target objects. Extensive experiments on several benchmarks, e.g., DOTA, FAIR1M, HRSC2016, and DIOR, show that our Strip R-CNN substantially improves on previous works. Notably, our 30M model achieves 82.75% mAP on DOTA-v1.0, setting a new state-of-the-art record. Code is available at https://github.com/YXB-NKU/Strip-R-CNN.
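To make the idea of "sequential orthogonal large strip convolutions" concrete, the following is a minimal single-channel sketch in plain Python. It is not the authors' implementation: the function names (`conv_rows`, `conv_cols`, `strip_conv`, `weight_counts`), the zero padding, and the kernel length `k` are all assumptions chosen for illustration. It applies a 1 x k horizontal strip followed by a k x 1 vertical strip, and compares the number of weights with a single k x k square kernel.

```python
def conv_rows(image, kernel):
    """Slide a 1 x k strip kernel along each row (zero padding, same size)."""
    pad = len(kernel) // 2
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0.0
            for i, w in enumerate(kernel):
                xx = x + i - pad
                if 0 <= xx < width:  # positions outside the image count as zero
                    total += w * image[y][xx]
            out[y][x] = total
    return out


def transpose(image):
    """Swap rows and columns of a 2D list."""
    return [list(col) for col in zip(*image)]


def conv_cols(image, kernel):
    """Slide a k x 1 strip kernel along each column by reusing conv_rows."""
    return transpose(conv_rows(transpose(image), kernel))


def strip_conv(image, h_kernel, v_kernel):
    """Apply the horizontal strip first, then the orthogonal vertical strip."""
    return conv_cols(conv_rows(image, h_kernel), v_kernel)


def weight_counts(k):
    """Weights per channel: one k x k square kernel vs. two strips of length k."""
    return {"square": k * k, "strips": 2 * k}


if __name__ == "__main__":
    # A single bright pixel in the centre of a 5 x 5 image (hypothetical input).
    img = [[0.0] * 5 for _ in range(5)]
    img[2][2] = 1.0
    for row in strip_conv(img, [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]):
        print(row)
    print(weight_counts(19))
```

The two strips together cover a k x k neighbourhood while using 2k weights per channel instead of k², which is one reason long strips are cheaper than square kernels of the same extent. A real detector would use learned, multi-channel kernels in a deep learning framework rather than nested Python loops.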