InstructPLM-mu: 1-Hour Fine-Tuning of ESM2 Beats ESM3 in Protein Mutation Predictions

📅 2025-10-03
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Can lightweight multimodal fine-tuning replace costly end-to-end pretraining to enhance sequence models’ performance in mutation effect prediction? This work proposes InstructPLM-mu, a framework that injects 3D structural information into the sequence-only model ESM2, designs a multimodal feature fusion mechanism, and employs an efficient fine-tuning strategy. With only one hour of fine-tuning, InstructPLM-mu achieves performance on par with end-to-end trained ESM3 across major mutation prediction benchmarks—yielding average Spearman correlation improvements of 0.04–0.12. To our knowledge, this is the first systematic demonstration that geometry-aware lightweight fine-tuning effectively bridges the geometric modeling gap of sequence-based protein language models, while reducing training computational cost by over 99%. The approach establishes a new paradigm for computationally efficient adaptation of protein language models to structure-informed downstream tasks.
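The benchmark numbers above are Spearman rank correlations between model-predicted mutation scores and experimental fitness measurements. A minimal NumPy sketch of that metric (ranking without tie-averaging, so it assumes distinct values; the input scores are illustrative, not from the paper):

```python
import numpy as np

def spearman(pred, exp):
    """Spearman rank correlation between predicted and experimental scores.

    Computed as the Pearson correlation of the ranks. Ties are broken by
    sort order here (no tie-averaging), which suffices for distinct values.
    """
    def ranks(x):
        order = np.argsort(x)
        r = np.empty(len(x))
        r[order] = np.arange(len(x))
        return r

    rp = ranks(np.asarray(pred, dtype=float))
    re = ranks(np.asarray(exp, dtype=float))
    rp, re = rp - rp.mean(), re - re.mean()
    return float(rp @ re / np.sqrt((rp @ rp) * (re @ re)))

# Perfectly monotone agreement between predictions and assay values -> 1.0
print(spearman([0.1, 0.5, 0.3, 0.9], [1.0, 2.6, 2.0, 3.1]))
```

Because the metric depends only on ranks, a 0.04-0.12 gain reflects better ordering of mutations by fitness, regardless of the scale of the raw model scores.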

📝 Abstract
Multimodal protein language models deliver strong performance on mutation-effect prediction, but training such models from scratch demands substantial computational resources. In this paper, we propose a fine-tuning framework called InstructPLM-mu and try to answer a question: *Can multimodal fine-tuning of a pretrained, sequence-only protein language model match the performance of models trained end-to-end?* Surprisingly, our experiments show that fine-tuning ESM2 with structural inputs can reach performance comparable to ESM3. To understand how this is achieved, we systematically compare three different feature-fusion designs and fine-tuning recipes. Our results reveal that both the fusion method and the tuning strategy strongly affect final accuracy, indicating that the fine-tuning process is not trivial. We hope this work offers practical guidance for injecting structure into pretrained protein language models and motivates further research on better fusion mechanisms and fine-tuning protocols.
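For context, a common zero-shot recipe for scoring mutations with ESM-family models is the masked-marginal log-odds: mask the mutated position, then compare the model's log-probability for the mutant versus wild-type residue. The paper's exact scoring may differ; this sketch uses dummy logits and an illustrative amino-acid ordering rather than any real checkpoint's tokenizer:

```python
import numpy as np

# Canonical 20-amino-acid alphabet; the ordering is an illustrative
# assumption, not the token order of any particular ESM checkpoint.
AA = "ACDEFGHIKLMNPQRSTVWY"

def log_softmax(logits):
    """Numerically stable log-softmax over the amino-acid axis."""
    shifted = logits - logits.max()
    return shifted - np.log(np.exp(shifted).sum())

def masked_marginal_score(logits_at_pos, wt_aa, mut_aa):
    """Log-odds score for one substitution at a masked position:
    log p(mut | context) - log p(wt | context).
    Positive values mean the model favors the mutant residue."""
    logp = log_softmax(np.asarray(logits_at_pos, dtype=float))
    return float(logp[AA.index(mut_aa)] - logp[AA.index(wt_aa)])

# Dummy logits standing in for a model's output at the masked position.
logits = np.zeros(len(AA))
logits[AA.index("K")] = 2.0  # model prefers K over the wild-type A here
print(masked_marginal_score(logits, "A", "K"))
```

Ranking all substitutions in a deep mutational scan by this score, then correlating against measured fitness, is how the Spearman benchmarks referenced in this paper are typically computed.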
Problem

Research questions and friction points this paper is trying to address.

Fine-tuning pretrained protein models with structural inputs
Matching end-to-end model performance via multimodal fine-tuning
Optimizing feature fusion designs for mutation prediction accuracy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fine-tuning ESM2 with structural inputs
Comparing three feature-fusion designs
Optimizing fine-tuning recipes for accuracy
Junde Xu
The Chinese University of Hong Kong
Yapin Shi
Hangzhou Institute for Advanced Study, CAS; University of Chinese Academy of Sciences
Lijun Lang
The Chinese University of Hong Kong
Taoyong Cui
The Chinese University of Hong Kong
Zhiming Zhang
Hangzhou Institute of Medicine, CAS; Tianjin University
Guangyong Chen
Hangzhou Institute of Medicine, CAS
Jiezhong Qiu
Zhejiang University - Zhejiang Lab Hundred Talents Program Researcher
Data Mining · Social Network Analysis · Network Embedding · Graph Neural Networks
Pheng-Ann Heng
The Chinese University of Hong Kong