Keeping Representation Similarity in Finetuning for Medical Image Analysis

📅 2025-03-10
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
In medical image finetuning, the representations of pretrained foundation models often degrade, undermining generalization. To address this, we propose RepSim, a method that introduces a learnable orthogonal manifold constraint based on similarity invariance to jointly optimize task adaptability and fidelity to pretrained representations during finetuning. RepSim integrates orthogonal manifold optimization, representation similarity regularization, and contrastive distance minimization. Evaluated on five medical image classification benchmarks, it improves representation similarity by over 30%, reduces model sharpness by 42%, and maintains competitive classification accuracy. Its core contributions are: (1) a representation-similarity-driven manifold constraint that preserves structural invariance in the latent space; and (2) a unified optimization objective that simultaneously retains pretrained knowledge and enhances downstream performance. RepSim thus bridges the gap between representation stability and task-specific adaptation in medical imaging.
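The summary does not give RepSim's exact regularizer, so the following is only an illustrative sketch: a common similarity-invariant measure of representation agreement is linear Centered Kernel Alignment (CKA), which could in principle be used to penalize drift from the pretrained features during finetuning. The function names and the weighting scheme here are hypothetical, not taken from the paper.

```python
import numpy as np

def linear_cka(X, Y):
    """Linear Centered Kernel Alignment between two feature matrices
    of shape (n_samples, dim). Invariant to orthogonal transformations
    and isotropic scaling; returns a similarity in [0, 1]."""
    X = X - X.mean(axis=0, keepdims=True)  # center each feature dimension
    Y = Y - Y.mean(axis=0, keepdims=True)
    hsic = np.linalg.norm(Y.T @ X, "fro") ** 2
    norm_x = np.linalg.norm(X.T @ X, "fro")
    norm_y = np.linalg.norm(Y.T @ Y, "fro")
    return hsic / (norm_x * norm_y)

def finetune_loss(task_loss, feats_finetuned, feats_pretrained, lam=0.1):
    """Hypothetical combined objective: the downstream task loss plus a
    penalty on dissimilarity to the frozen pretrained representations."""
    return task_loss + lam * (1.0 - linear_cka(feats_finetuned, feats_pretrained))
```

Because CKA is invariant to rotations of the feature space, such a penalty constrains the structure of the representation rather than its raw coordinates, which matches the "similarity invariance" framing in the summary.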

πŸ“ Abstract
Foundation models pretrained on large-scale natural images have been widely adapted to medical image analysis through finetuning. This is largely attributed to pretrained representations capturing universal, robust, and generalizable features, which can be reused by downstream tasks. However, these representations have been found to gradually vanish during finetuning, accompanied by a degradation of the foundation model's original abilities, e.g., generalizability. In this paper, we argue that pretrained representations can be well preserved while still effectively adapting to downstream tasks. We study this by proposing a new finetuning method, RepSim, which minimizes the distance between pretrained and finetuned representations by constraining a learnable orthogonal manifold based on similarity invariance. Compared to standard finetuning methods, e.g., full finetuning, our method improves representation similarity by over 30% while maintaining competitive accuracy, and reduces sharpness by 42% across five medical image classification datasets. The code will be released.
Problem

Research questions and friction points this paper is trying to address.

Preserve pretrained representations during finetuning for medical image analysis.
Prevent degradation of the foundation model's generalizability in downstream tasks.
Improve representation similarity and reduce sharpness in medical image classification.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Preserves pretrained representations during finetuning
Uses orthogonal manifold for similarity invariance
Improves representation similarity by over 30%
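The page does not specify how the orthogonal manifold is parameterized or enforced. As a hedged sketch of one standard technique (not necessarily the paper's), a matrix can be kept on the orthogonal manifold by projecting it back after each update via the polar decomposition; `project_to_orthogonal` is a hypothetical helper name.

```python
import numpy as np

def project_to_orthogonal(W):
    """Project a square matrix onto the nearest orthogonal matrix in
    Frobenius norm (polar decomposition via SVD):
    argmin_Q ||W - Q||_F subject to Q^T Q = I."""
    U, _, Vt = np.linalg.svd(W)
    return U @ Vt
```

Transforming features with an orthogonal matrix preserves pairwise inner products and distances, so constraining the learnable adapter to this manifold is one way to adapt to a task while leaving the similarity structure of the pretrained representation intact.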
Wenqiang Zu
Institute of Automation, Chinese Academy of Sciences; Beijing Academy of Artificial Intelligence
Shenghao Xie
Ph.D. Student, AAIS, PKU
Computer Vision, Machine Learning
Hao Chen
School of Chemical Sciences, University of Chinese Academy of Sciences
Yiming Liang
Institute of Automation, Chinese Academy of Sciences (CASIA); M-A-P
LLM
Lei Ma
Academy for Advanced Interdisciplinary Studies, National Biomedical Imaging Center, National Key Laboratory for Multimedia Information Processing, Peking University; Beijing Academy of Artificial Intelligence