🤖 AI Summary
This work addresses the challenge of deploying large-scale foundation models for object segmentation in optical remote sensing imagery, where full-parameter fine-tuning incurs prohibitive memory and computational costs. To this end, we propose WEFT, an efficient fine-tuning approach guided by dynamic wavelet experts. WEFT introduces, for the first time in remote sensing segmentation, a learnable wavelet expert extractor coupled with a conditional adapter, which enhances the fine-grained perceptual capabilities of a frozen foundation model while tuning only a small fraction of the parameters. By integrating wavelet transforms, dynamic expert mechanisms, and parameter-efficient fine-tuning, WEFT outperforms 21 state-of-the-art methods across three remote sensing benchmarks and achieves superior performance on camouflaged, natural, and medical image segmentation tasks, all while significantly reducing training resource consumption.
📝 Abstract
Accurately localizing and segmenting relevant objects in optical remote sensing images (ORSIs) is critical for advancing remote sensing applications. Existing methods are typically built upon moderate-scale pre-trained models and employ diverse optimization strategies to achieve promising performance under full-parameter fine-tuning. Deeper, larger-scale foundation models could provide stronger support for performance improvement. However, because of their massive number of parameters, directly adopting full-parameter fine-tuning leads to pronounced training difficulties, such as excessive GPU memory consumption and high computational costs, so large-scale models remain largely unexplored in existing works. In this paper, we propose a novel dynamic wavelet expert-guided fine-tuning paradigm with fewer trainable parameters, dubbed WEFT, which efficiently adapts large-scale foundation models to ORSI segmentation by leveraging the guidance of wavelet experts. Specifically, we introduce a task-specific wavelet expert extractor that models wavelet experts from different perspectives and dynamically regulates their outputs, thereby generating trainable features enriched with task-specific information for subsequent fine-tuning. Furthermore, we construct an expert-guided conditional adapter that first enhances the fine-grained perception of frozen features for the target task by injecting the trainable features, and then iteratively updates both types of features, enabling efficient fine-tuning. Extensive experiments show that our WEFT not only outperforms 21 state-of-the-art (SOTA) methods on three ORSI datasets, but also achieves top results in camouflaged, natural, and medical image segmentation scenarios. The source code is available at: https://github.com/CSYSI/WEFT.
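To make the core idea concrete, here is a minimal, hypothetical sketch of the mechanism the abstract describes, not the authors' implementation: a single-level 2D Haar wavelet transform splits a feature map into four subband "experts" (LL, LH, HL, HH), a softmax gate dynamically weights them into a trainable feature, and that feature is injected residually into a frozen backbone feature, adapter-style. All function names, shapes, and the mixing weight `alpha` are assumptions for illustration.

```python
import math

def haar_experts(x):
    """Single-level 2D Haar decomposition of a 2D list x (H x W, even dims).
    Returns four subbands [LL, LH, HL, HH], each of size (H//2) x (W//2)."""
    h, w = len(x), len(x[0])
    ll, lh, hl, hh = [], [], [], []
    for i in range(0, h, 2):
        rll, rlh, rhl, rhh = [], [], [], []
        for j in range(0, w, 2):
            a, b = x[i][j], x[i][j + 1]
            c, d = x[i + 1][j], x[i + 1][j + 1]
            rll.append((a + b + c + d) / 2.0)  # low-frequency approximation
            rlh.append((a - b + c - d) / 2.0)  # horizontal detail
            rhl.append((a + b - c - d) / 2.0)  # vertical detail
            rhh.append((a - b - c + d) / 2.0)  # diagonal detail
        ll.append(rll); lh.append(rlh); hl.append(rhl); hh.append(rhh)
    return [ll, lh, hl, hh]

def dynamic_expert_feature(x, gate_logits):
    """Softmax-gate the four wavelet-subband 'experts' into one
    trainable feature map (gate_logits would be learned in practice)."""
    experts = haar_experts(x)
    m = max(gate_logits)
    w = [math.exp(g - m) for g in gate_logits]
    s = sum(w)
    w = [v / s for v in w]
    h2, w2 = len(experts[0]), len(experts[0][0])
    return [[sum(w[k] * experts[k][i][j] for k in range(4))
             for j in range(w2)] for i in range(h2)]

def adapter_inject(frozen, trainable, alpha=0.1):
    """Residual injection of the trainable feature into a frozen feature;
    only the gate (and alpha) would be updated during fine-tuning."""
    return [[f + alpha * t for f, t in zip(fr, tr)]
            for fr, tr in zip(frozen, trainable)]

x = [[1.0] * 4 for _ in range(4)]        # stand-in input feature map
gate = [0.0, 0.0, 0.0, 0.0]              # uniform gating logits
frozen = [[0.0] * 2 for _ in range(2)]   # stand-in frozen backbone feature
fused = adapter_inject(frozen, dynamic_expert_feature(x, gate))
print(fused)  # each entry: 0.1 * (0.25 * 2.0) = 0.05
```

In a real system the subband experts would be processed by small learnable branches and the gate conditioned on the input, but the shape of the computation, decompose, gate, inject, follows the paradigm the abstract outlines.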