🤖 AI Summary
This work proposes LEMMA, a lightweight semantic segmentation method for marine remote sensing imagery, addressing the high computational cost that hinders real-time deployment on resource-constrained devices. The key innovation is the integration of the Laplacian pyramid into marine segmentation tasks, enabling early fusion of edge information during feature extraction and thereby reducing reliance on deep, computationally expensive feature maps. Extensive experiments demonstrate that LEMMA achieves 93.42% IoU on the Oil Spill dataset and 98.97% mIoU on Mastr1325, while significantly improving efficiency: compared to existing approaches, it reduces model parameters by up to 71×, GFLOPs by up to 88.5%, and inference time by up to 84.65%, effectively balancing accuracy and computational efficiency.
📝 Abstract
Semantic segmentation in marine environments is crucial for the autonomous navigation of unmanned surface vessels (USVs) and for Earth observation of coastal events such as oil spills. However, existing methods, which often rely on deep CNNs and transformer-based architectures, are difficult to deploy due to their high computational cost and resource-intensive nature. These limitations hinder real-time, low-cost applications in real-world marine settings.
To address this, we propose LEMMA, a lightweight semantic segmentation model designed specifically for accurate remote sensing segmentation under resource constraints. The proposed architecture leverages Laplacian pyramids to enhance edge recognition, a critical component for effective feature extraction in complex marine environments for disaster response, environmental surveillance, and coastal monitoring. By integrating edge information early in the feature extraction process, LEMMA avoids computing expensive feature maps in deeper network layers, drastically reducing model size, complexity, and inference time. LEMMA demonstrates state-of-the-art performance across datasets captured from diverse platforms while reducing trainable parameters by up to 71×, GFLOPs by up to 88.5%, and inference time by up to 84.65% compared to existing models. Experimental results highlight its effectiveness and real-world applicability, including 93.42% IoU on the Oil Spill dataset and 98.97% mIoU on Mastr1325.
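To make the edge-extraction idea concrete, here is a minimal NumPy sketch of the classic Laplacian pyramid decomposition: each level subtracts a blurred, downsampled-then-upsampled copy of the image from the image itself, leaving a band of high-frequency (edge) detail per scale. This is a generic illustration of the pyramid construction, not the paper's implementation; the kernel choice, function names, and nearest-neighbour upsampling are assumptions for brevity.

```python
import numpy as np

def gaussian_blur(img):
    # Separable 5-tap binomial kernel [1, 4, 6, 4, 1] / 16,
    # a common smoothing filter for image pyramids.
    k = np.array([1, 4, 6, 4, 1], dtype=np.float64) / 16.0
    blurred = np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, img)
    return np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, blurred)

def laplacian_pyramid(img, levels=3):
    """Return `levels` high-frequency (edge) bands plus the coarse residual."""
    bands = []
    current = img.astype(np.float64)
    for _ in range(levels):
        low = gaussian_blur(current)
        down = low[::2, ::2]                                       # downsample by 2
        up = np.repeat(np.repeat(down, 2, axis=0), 2, axis=1)      # nearest-neighbour upsample
        up = up[: current.shape[0], : current.shape[1]]            # crop to match odd sizes
        bands.append(current - gaussian_blur(up))                  # band-pass: edges at this scale
        current = down
    bands.append(current)                                          # low-resolution residual
    return bands
```

In a lightweight segmentation network, such per-scale edge bands could be concatenated with early feature maps, which matches the paper's stated motivation of injecting edge cues before the expensive deep layers rather than recovering them there.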