Neptune-X: Active X-to-Maritime Generation for Universal Maritime Object Detection

📅 2025-09-25
🤖 AI Summary
Maritime object detection faces two key challenges: scarcity of annotated data and poor generalization across diverse scenarios, particularly underrepresented ones such as open-sea environments. To address these, we propose a synthetic data augmentation framework in which generation and selection work together. First, we design an X-to-Maritime multimodal diffusion model incorporating bidirectional object-water attention to synthesize diverse, high-fidelity maritime scene images. Second, we introduce an attribute-correlated active sampling strategy that enables task-aware selection of high-quality synthetic samples. The framework integrates multimodal conditional generation, attention-based modeling, and active learning end to end. Evaluated on data-scarce open-sea scenarios, our method achieves significant improvements in detection accuracy (up to 12.7% mAP gain over baselines) and establishes a new benchmark for generative vision learning in maritime domains.

📝 Abstract
Maritime object detection is essential for navigation safety, surveillance, and autonomous operations, yet it is constrained by two key challenges: the scarcity of annotated maritime data and poor generalization across various maritime attributes (e.g., object category, viewpoint, location, and imaging environment). In particular, models trained on existing datasets often underperform in underrepresented scenarios such as open-sea environments. To address these challenges, we propose Neptune-X, a data-centric generative-selection framework that enhances training effectiveness by combining synthetic data generation with task-aware sample selection. From the generation perspective, we develop X-to-Maritime, a multi-modality-conditioned generative model that synthesizes diverse and realistic maritime scenes. A key component is the Bidirectional Object-Water Attention module, which captures boundary interactions between objects and their aquatic surroundings to improve visual fidelity. To further improve downstream task performance, we propose Attribute-correlated Active Sampling, which dynamically selects synthetic samples based on their task relevance. To support robust benchmarking, we construct the Maritime Generation Dataset, the first dataset tailored for generative maritime learning, encompassing a wide range of semantic conditions. Extensive experiments demonstrate that our approach sets a new benchmark in maritime scene synthesis, significantly improving detection accuracy, particularly in challenging and previously underrepresented settings. The code is available at https://github.com/gy65896/Neptune-X.
Problem

Research questions and friction points this paper is trying to address.

Addressing maritime object detection challenges due to scarce annotated data
Improving poor generalization across diverse maritime attributes and environments
Enhancing detection performance in underrepresented scenarios like open-sea settings
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-modality-conditioned generative model synthesizes maritime scenes
Bidirectional Object-Water Attention module improves visual fidelity
Attribute-correlated Active Sampling dynamically selects relevant synthetic samples
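
The abstract does not spell out how Attribute-correlated Active Sampling scores samples, so the following is only a minimal illustrative sketch of task-aware selection in general, not the paper's algorithm. It assumes, hypothetically, that each synthetic sample carries a set of attribute labels and a per-sample detector loss, and it favors samples that are both hard for the detector and rare in the real training set. The names `SyntheticSample`, `detector_loss`, and `select_samples` are invented for this example.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class SyntheticSample:
    name: str
    attributes: Tuple[str, ...]  # e.g. ("buoy", "open-sea"); hypothetical labels
    detector_loss: float         # hypothetical difficulty score from a detector


def select_samples(
    candidates: List[SyntheticSample],
    real_attribute_counts: Dict[str, int],
    k: int,
) -> List[SyntheticSample]:
    """Return the k candidates with the highest (loss x attribute rarity) score.

    This scoring rule is an assumption made for illustration only.
    """
    total = sum(real_attribute_counts.values())

    def rarity(attr: str) -> float:
        # Add-one smoothing so unseen attributes do not divide by zero.
        return total / (real_attribute_counts.get(attr, 0) + 1)

    def score(sample: SyntheticSample) -> float:
        if not sample.attributes:
            return 0.0
        mean_rarity = sum(rarity(a) for a in sample.attributes) / len(sample.attributes)
        return sample.detector_loss * mean_rarity

    return sorted(candidates, key=score, reverse=True)[:k]


# Toy usage: open-sea and buoy samples are rare in the (made-up) real data.
real_counts = {"harbor": 90, "open-sea": 10, "ship": 80, "buoy": 20}
pool = [
    SyntheticSample("a", ("ship", "harbor"), 1.0),
    SyntheticSample("b", ("buoy", "open-sea"), 1.0),
    SyntheticSample("c", ("ship", "open-sea"), 0.5),
]
chosen = select_samples(pool, real_counts, k=2)
```

In this toy pool the two samples containing the rare open-sea attribute are preferred over the common ship-in-harbor sample, which mirrors the stated goal of strengthening underrepresented settings.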
Yu Guo
Hong Kong JC STEM Lab of Smart City and Department of Computer Science, City University of Hong Kong
Shengfeng He
Singapore Management University
Visual Computing, Generative Models, Computer Vision, Computational Photography, Computer Graphics
Yuxu Lu
The Hong Kong Polytechnic University
Haonan An
Hong Kong JC STEM Lab of Smart City and Department of Computer Science, City University of Hong Kong
Yihang Tao
City University of Hong Kong
Collaborative Perception, Autonomous Driving, World Model
Huilin Zhu
Wuhan University of Science and Technology
Jingxian Liu
State Key Laboratory of Maritime Technology and Safety, Wuhan University of Technology
Yuguang Fang
Hong Kong JC STEM Lab of Smart City and Department of Computer Science, City University of Hong Kong