Pre-training Everywhere: Parameter-Efficient Fine-Tuning for Medical Image Analysis via Target Parameter Pre-training

📅 2024-08-27
🏛️ arXiv.org
📈 Citations: 1
Influential: 0

🤖 AI Summary
Mainstream parameter-efficient fine-tuning (PEFT) methods for self-supervised medical image analysis typically initialize newly introduced parameters randomly, resulting in weak representation capability and poor generalization. To address this, we propose Target Parameter Pre-training (TPP), a framework that keeps the pre-trained backbone frozen and runs self-supervised pre-training on downstream data exclusively for the newly introduced PEFT parameters (e.g., LoRA weights, adapter modules). TPP is the first to extend the pre-training paradigm to all trainable parameters within PEFT methods. It is fully compatible with mainstream PEFT approaches and generalizes across multiple medical imaging modalities (CT, MRI, US) and diverse tasks (classification, segmentation). Evaluated on five public medical imaging benchmarks, TPP consistently improves PEFT performance by 3.2–7.8% on average, accelerates convergence by ~40%, and significantly enhances both fine-tuning efficiency and cross-dataset generalization.

📝 Abstract
Parameter-efficient fine-tuning (PEFT) techniques have emerged to address the overfitting and high computational costs associated with full fine-tuning in the self-supervised learning paradigm. Mainstream PEFT-based methods add a few trainable parameters while keeping the pre-trained parameters of the backbone fixed. These methods achieve comparable, and often superior, performance to full fine-tuning, demonstrating the powerful representation ability of the pre-trained backbone. Despite their success, these methods typically ignore the initialization of the new parameters, often relying solely on random initialization. We argue that if pre-training is significantly beneficial, it should be applied to all parameters requiring representational capacity. Motivated by this insight, we propose a simple yet effective fine-tuning framework based on Target Parameter Pre-training (TPP). The target parameters are the new parameters introduced during fine-tuning. TPP adds a stage before PEFT to pre-train these target parameters. During this stage, the pre-trained backbone parameters are frozen, and only the target parameters are trainable. A defined pretext task encourages the target parameters to learn specific representations of downstream data. When PEFT is subsequently employed, the pre-trained target parameters are loaded to enhance fine-tuning efficiency. The proposed TPP framework is versatile, allowing for the integration of various pretext tasks for pre-training and supporting different PEFT methods as backbones. We evaluated the fine-tuning performance of our method on five public datasets covering three modalities and two task types. The results demonstrate that the proposed TPP can be easily integrated into existing PEFT methods, significantly improving performance.
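
The two-stage procedure described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the linear "backbone", the rank-1 LoRA-style target parameters `a` and `b`, the masked-reconstruction pretext task, the linear "teacher" that generates labels, and all names and hyperparameters are assumptions chosen only to keep the example dependency-free. It shows the ordering of the stages (pre-train the target parameters with the backbone frozen, then reuse them as the PEFT initialization), not the reported results.

```python
import copy
import random

random.seed(0)
DIM = 4

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def forward(x, backbone, target):
    """Frozen linear backbone plus a rank-1, LoRA-style update: (W + b a^T) x."""
    ax = dot(target["a"], x)
    return [dot(row, x) + bi * ax for row, bi in zip(backbone["W"], target["b"])]

def mse(data, backbone, target):
    """Mean squared error over (input, desired output) pairs."""
    per_sample = [
        sum((o - t) ** 2 for o, t in zip(forward(x, backbone, target), y)) / DIM
        for x, y in data
    ]
    return sum(per_sample) / len(per_sample)

def train(data, backbone, target, lr=0.05, epochs=50):
    """Plain SGD that updates only the target parameters; W is read, never written."""
    for _ in range(epochs):
        for x, y in data:
            out = forward(x, backbone, target)
            err = [2.0 * (o - t) / DIM for o, t in zip(out, y)]  # dLoss/dOut
            ax = dot(target["a"], x)
            eb = dot(err, target["b"])
            grad_a = [eb * xj for xj in x]   # dLoss/da
            grad_b = [e * ax for e in err]   # dLoss/db
            target["a"] = [p - lr * g for p, g in zip(target["a"], grad_a)]
            target["b"] = [p - lr * g for p, g in zip(target["b"], grad_b)]
    return target

def mask(x):
    """Hypothetical pretext corruption: zero out one randomly chosen coordinate."""
    k = random.randrange(DIM)
    return [0.0 if j == k else xj for j, xj in enumerate(x)]

# Toy stand-in for a pre-trained backbone (assumption: 0.5 * identity).
backbone = {"W": [[0.5 if i == j else 0.0 for j in range(DIM)] for i in range(DIM)]}
backbone_before = copy.deepcopy(backbone)

# Downstream inputs, used without labels in stage 1 and with labels in stage 2.
inputs = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(32)]
pretext_data = [(mask(x), x) for x in inputs]  # reconstruct x from its masked copy

# Labels from a hypothetical linear teacher W + u v^T (illustration only).
u, v = [1.0, -0.5, 0.25, 0.0], [0.5, 0.5, -0.5, 1.0]
labeled_data = [
    (x, [dot(row, x) + ui * dot(v, x) for row, ui in zip(backbone["W"], u)])
    for x in inputs
]

# Stage 1: pre-train the target parameters on the pretext task, backbone frozen.
# LoRA-style start: random a, zero b, so the update is initially a no-op.
target = {"a": [random.uniform(-0.5, 0.5) for _ in range(DIM)], "b": [0.0] * DIM}
pretext_before = mse(pretext_data, backbone, target)
target = train(pretext_data, backbone, target)
pretext_after = mse(pretext_data, backbone, target)

# Stage 2: ordinary PEFT on labeled data, starting from the stage-1 parameters.
finetune_before = mse(labeled_data, backbone, target)
target = train(labeled_data, backbone, target)
finetune_after = mse(labeled_data, backbone, target)

print("pretext loss:  ", pretext_before, "->", pretext_after)
print("fine-tune loss:", finetune_before, "->", finetune_after)
```

In a real setting, stage 2 would load the saved stage-1 parameters into whichever PEFT method is in use (LoRA, adapters, and so on); comparing against a run that skips stage 1 would be the natural ablation, which this toy sketch does not attempt to reproduce.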
Problem

Research questions and friction points this paper is trying to address.

Addresses the overfitting and high computational costs of full fine-tuning in medical image analysis
Improves initialization of new parameters in fine-tuning via pre-training
Enhances performance across diverse datasets and medical imaging modalities
Innovation

Methods, ideas, or system contributions that make the work stand out.

Target Parameter Pre-training enhances PEFT
Pre-trains new parameters before fine-tuning
Improves performance across multiple datasets
Xingliang Lei
National Engineering Laboratory for Integrated Aero-Space-Ground-Ocean Big Data Application Technology, School of Computer Science and Engineering, Northwestern Polytechnical University, China
Yiwen Ye
National Engineering Laboratory for Integrated Aero-Space-Ground-Ocean Big Data Application Technology, School of Computer Science and Engineering, Northwestern Polytechnical University, China
Ziyang Chen
Peking University
Quantum key distribution, Quantum random number generation
Minglei Shu
Shandong Artificial Intelligence Institute, Qilu University of Technology (Shandong Academy of Sciences), China
Yong Xia
National Engineering Laboratory for Integrated Aero-Space-Ground-Ocean Big Data Application Technology, School of Computer Science and Engineering, Northwestern Polytechnical University, China