Rank-and-Reason: Multi-Agent Collaboration Accelerates Zero-Shot Protein Mutation Prediction

📅 2026-01-30
🤖 AI Summary
This work addresses two limitations of current protein language models in zero-shot mutation prediction: their outputs often ignore biophysical constraints, and selecting candidates from them relies on inefficient, subjective manual curation. The authors propose Rank-and-Reason (VenusRAR), the first multi-agent collaborative two-stage framework. The first stage uses a context-aware multimodal ensemble to rank and prioritize candidate mutations; the second stage deploys a virtual expert panel that applies chain-of-thought reasoning to audit candidates against geometric and structural constraints. On ProteinGym, VenusRAR achieves a Spearman correlation of 0.551 (previous best: 0.518), and it improves the Top-5 hit rate by up to 367% on ProteinGym-DMS99. Wet-lab validation on Cas12i3 mutants yields a 46.7% positive rate and identifies two novel variants with 4.23× and 5.05× enhanced activity, indicating gains in both prediction reliability and experimental efficiency.

📝 Abstract
Zero-shot mutation prediction is vital for low-resource protein engineering, yet existing protein language models (PLMs) often yield statistically confident results that ignore fundamental biophysical constraints. Currently, selecting candidates for wet-lab validation relies on manual expert auditing of PLM outputs, a process that is inefficient, subjective, and highly dependent on domain expertise. To address this, we propose Rank-and-Reason (VenusRAR), a two-stage agentic framework to automate this workflow and maximize expected wet-lab fitness. In the Rank-Stage, a Computational Expert and Virtual Biologist aggregate a context-aware multi-modal ensemble, establishing a new Spearman correlation record of 0.551 (vs. 0.518) on ProteinGym. In the Reason-Stage, an agentic Expert Panel employs chain-of-thought reasoning to audit candidates against geometric and structural constraints, improving the Top-5 Hit Rate by up to 367% on ProteinGym-DMS99. The wet-lab validation on Cas12i3 nuclease further confirms the framework's efficacy, achieving a 46.7% positive rate and identifying two novel mutants with 4.23-fold and 5.05-fold activity improvements. Code and datasets are released on GitHub (https://github.com/ai4protein/VenusRAR/).
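
The two headline metrics in the abstract, Spearman correlation and Top-5 hit rate, can be illustrated with a minimal sketch. Everything below is an assumption for illustration rather than the authors' implementation: the unweighted rank averaging in `rank_ensemble` is a simple stand-in for the paper's context-aware ensemble, and `top_k_hit_rate` assumes a "hit" means a mutant whose measured fitness exceeds a chosen threshold (for example, the wild-type value), which may differ from the definition used on ProteinGym-DMS99. All function names are hypothetical.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def rank_ensemble(score_lists):
    """Assumed stand-in: average each mutant's rank across models, unweighted."""
    ranked = [average_ranks(scores) for scores in score_lists]
    return [mean(column) for column in zip(*ranked)]


def top_k_hit_rate(predicted, fitness, k, threshold):
    """Fraction of the k highest-scored mutants whose fitness exceeds threshold."""
    top = sorted(range(len(predicted)), key=lambda i: predicted[i], reverse=True)[:k]
    return sum(fitness[i] > threshold for i in top) / k
```

In this sketch, a higher ensemble value means a higher-ranked mutant, so the output of `rank_ensemble` can be passed directly as `predicted` to `spearman` or `top_k_hit_rate` alongside measured fitness values.
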
Problem

Research questions and friction points this paper is trying to address.

zero-shot protein mutation prediction
protein language models
biophysical constraints
wet-lab validation
expert auditing
Innovation

Methods, ideas, or system contributions that make the work stand out.

multi-agent collaboration
zero-shot protein mutation prediction
chain-of-thought reasoning
context-aware multimodal ensemble
biophysical constraint auditing
Yang Tan
Shanghai Jiao Tong University & Shanghai Innovation Institute
Bioinformatics, Deep Learning
Yuyuan Xi
Shanghai Jiao Tong University
Can Wu
East China University of Science and Technology
Bozitao Zhong
Shanghai Jiao Tong University
Computational Biology, Protein Design, Deep Learning, Synthetic Biology
Mingchen Li
Shanghai Jiao Tong University
Guisheng Fan
East China University of Science and Technology
Jiankang Zhu
Southern University of Science and Technology
Yafeng Liang
Southern University of Science and Technology
Nanqing Dong
Shanghai Artificial Intelligence Laboratory; University of Oxford
Machine Learning, Computer Vision, Optimization, AI for Science
Liang Hong
School of Physics and Astronomy & Institute of Natural Sciences, Shanghai Jiao Tong University
Biophysics, Polymer Physics, Water Dynamics