🤖 AI Summary
In dynamic maritime environments, precise alignment between visual perception and Electronic Chart Display and Information System (ECDIS) data remains challenging due to motion blur, occlusion, and sensor uncertainty. Method: This paper proposes an end-to-end vision-chart fusion framework that jointly detects navigational aids in images and matches their bounding boxes to corresponding geographic chart symbols, bypassing conventional camera calibration and ray-casting steps. The method integrates YOLOv7-based detection, differentiable camera projection modeling, and real-time streaming data fusion within a transformer-based joint detection-and-matching architecture. Contribution/Results: Evaluated on a real-world maritime video dataset, the approach achieves significantly higher buoy-matching accuracy than both a standard ray-casting baseline and an enhanced YOLOv7 baseline, demonstrating robustness under adverse sea conditions and practical viability for low-latency, high-precision maritime situational awareness.
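The ray-casting baseline the summary refers to rests on standard camera projection: a chart marker's world position is mapped into pixel coordinates through a calibrated pinhole model and then compared against image detections. A minimal sketch of that projection step, assuming a known intrinsic matrix `K` and camera pose `(R, t)` (all values here are illustrative, not from the paper):

```python
import numpy as np

def project_to_image(p_world, K, R, t):
    """Project a 3-D world point into pixel coordinates (pinhole model)."""
    p_cam = R @ p_world + t          # world frame -> camera frame
    u, v, w = K @ p_cam              # apply camera intrinsics
    return np.array([u / w, v / w])  # perspective divide

# Hypothetical calibration: focal length 500 px, principal point (320, 240)
K = np.array([[500.0,   0.0, 320.0],
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])
R = np.eye(3)     # camera aligned with the world frame
t = np.zeros(3)   # camera at the world origin

# A buoy 2 m right, 1 m down, 10 m ahead of the camera
px = project_to_image(np.array([2.0, 1.0, 10.0]), K, R, t)
# px -> [420., 290.]
```

In practice this step is sensitive to the very error sources the summary lists (vessel motion, imprecise calibration), which is the motivation for learning the image-to-chart association directly instead.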
📝 Abstract
This paper presents a novel approach to enhancing marine vision by fusing real-time visual data with chart information. Our system overlays nautical chart data onto live video feeds by accurately matching detected navigational aids, such as buoys, with their corresponding chart representations. To achieve robust association, we introduce a transformer-based end-to-end neural network that predicts bounding boxes and confidence scores for buoy queries, enabling direct matching of image-domain detections with world-space chart markers. The proposed method is compared against baseline approaches, including a ray-casting model that estimates buoy positions via camera projection and a YOLOv7-based network extended with a distance estimation module. Experimental results on a dataset of real-world maritime scenes demonstrate that our approach significantly improves object localization and association accuracy in dynamic and challenging environments.
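A common way to realize the detection-to-chart association the abstract describes is bipartite matching: each predicted buoy is assigned to at most one chart marker by minimizing a pairwise cost, as DETR-style query-based detectors do with Hungarian matching. The sketch below is not the paper's method, only a minimal illustration under that assumption, with Euclidean distance in a shared frame as the cost and an arbitrary gating threshold:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_detections_to_chart(pred_pos, chart_pos, max_dist=50.0):
    """Hungarian matching of predicted buoy positions to chart markers.

    pred_pos:  (N, 2) predicted positions in a shared coordinate frame
    chart_pos: (M, 2) chart marker positions in the same frame
    Returns a list of (det_idx, chart_idx) pairs within max_dist.
    """
    # Pairwise Euclidean cost matrix, shape (N, M)
    cost = np.linalg.norm(pred_pos[:, None, :] - chart_pos[None, :, :], axis=-1)
    rows, cols = linear_sum_assignment(cost)
    # Gate out assignments that are implausibly far apart
    return [(int(r), int(c)) for r, c in zip(rows, cols) if cost[r, c] <= max_dist]

preds = np.array([[10.0, 5.0], [100.0, 40.0]])
chart = np.array([[98.0, 41.0], [11.0, 4.0], [300.0, 300.0]])
pairs = match_detections_to_chart(preds, chart)
# pairs -> [(0, 1), (1, 0)]; the distant marker at (300, 300) stays unmatched
```

In an end-to-end network of the kind described, an equivalent assignment is typically used as the training-time matching between queries and ground-truth markers, while the gating threshold handles chart buoys outside the camera's field of view.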