Proteo-R1: Reasoning Foundation Models for De Novo Protein Design

📅 2026-05-01
📈 Citations: 0 (influential: 0)

🤖 AI Summary
Current de novo protein design methods lack an explicit reasoning step, so decisions about which residues are functionally critical stay entangled with the structure-generation process, which limits interpretability and controllability. This work proposes Proteo-R1, a framework that feeds residue-level functional reasoning into the design pipeline as a hard constraint: a multimodal large language model identifies the functionally essential sites, and a diffusion model performs conditional geometric co-design subject to them. This dual-expert architecture decouples molecular understanding from structure generation, which the authors report makes the integration of LLM-based reasoning with generative modeling stable and modular. According to the summary, the approach improves design fidelity, interpretability, and the reuse of biochemical knowledge, and allows key interaction sites to be anchored precisely and varied systematically.
📝 Abstract
Deep learning in de novo protein design has achieved atomic-level fidelity. However, existing models remain largely non-deliberative: they directly synthesize molecular geometries without explicitly reasoning about which residues or interactions are functionally essential. As a result, design decisions are entangled with continuous sampling dynamics, limiting interpretability, controllability, and systematic reuse of biochemical knowledge. We introduce Proteo-R1, a reasoning-guided protein design framework that explicitly decouples molecular understanding from geometric generation. Proteo-R1 adopts a dual-expert architecture in which a multimodal large language model (MLLM) serves as an understanding expert, analyzing protein sequences, structures, and textual context to identify key functional residues that govern binding and specificity. These residue-level decisions are then passed as hard constraints to a separate diffusion-based generation expert, which performs conditional co-design while respecting the fixed interaction anchors. This factorization mirrors how human experts approach molecular engineering: first reasoning about critical interactions, then optimizing geometry subject to those constraints. By operationalizing reasoning as explicit residue-level commitments rather than latent textual guidance, Proteo-R1 achieves stable, interpretable, and modular integration of LLM reasoning with state-of-the-art geometric generative models. Code, data, and demos are available at https://smiles724.github.io/r1/.
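To make the "reasoning as hard constraints" idea concrete, here is a minimal toy sketch of the two-stage flow the abstract describes. It is not the authors' implementation: `ResidueConstraint`, `understanding_expert`, and `generation_expert` are hypothetical names, the "understanding" stage simply copies residues at hand-picked positions instead of calling an MLLM, and the "generation" stage is a random resampling loop over sequence only, standing in for a diffusion model that would co-design sequence and structure. The one property it does illustrate is that anchored positions are excluded from sampling, so the committed residues cannot drift.

```python
import random
from dataclasses import dataclass

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes


@dataclass(frozen=True)
class ResidueConstraint:
    """Hypothetical record of one residue-level commitment."""
    position: int   # 0-based index in the designed chain
    residue: str    # one-letter amino acid code to keep fixed
    rationale: str  # free-text justification from the reasoning stage


def understanding_expert(template, hotspot_positions):
    """Stand-in for the MLLM: commit to the template residue at each hotspot."""
    return [
        ResidueConstraint(i, template[i], "assumed interaction hotspot")
        for i in hotspot_positions
    ]


def generation_expert(length, constraints, steps=10, seed=0):
    """Stand-in for the diffusion model: refine free positions, never anchors."""
    rng = random.Random(seed)
    fixed = {c.position: c.residue for c in constraints}
    if any(not 0 <= p < length for p in fixed):
        raise ValueError("constraint position outside the designed chain")

    # Start from noise everywhere except the anchored positions.
    design = [fixed[i] if i in fixed else rng.choice(AMINO_ACIDS)
              for i in range(length)]
    free_positions = [i for i in range(length) if i not in fixed]

    for step in range(steps):
        # Toy "noise schedule": resample fewer free positions as steps progress.
        resample_prob = 1.0 - (step + 1) / steps
        for i in free_positions:
            if rng.random() < resample_prob:
                design[i] = rng.choice(AMINO_ACIDS)
    return "".join(design)


if __name__ == "__main__":
    template = "MKTAYIAKQRQISFVKSHFSRQ"  # arbitrary illustrative sequence
    constraints = understanding_expert(template, [2, 5, 10])
    print(generation_expert(len(template), constraints))
```

Because the constraints travel between the stages as plain data rather than as text prompts, either stage could in principle be swapped out independently, which is the modularity argument the abstract makes.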
Problem

Research questions and friction points this paper is trying to address.

de novo protein design
reasoning
functional residues
interpretability
controllability
Innovation

Methods, ideas, or system contributions that make the work stand out.

reasoning-guided design
de novo protein design
multimodal large language model
diffusion-based generation
residue-level constraints
Authors

Fang Wu
Stanford University
AI, Deep Learning

Weihao Xuan
The University of Tokyo, RIKEN
Natural Language Processing, Computer Vision, Multimodal AI, Generative AI, LLM Agent

Heli Qi
Waseda University, RIKEN
Multi-Modal Learning

Hanqun Cao
The Chinese University of Hong Kong
Generative Modeling, AI4Science

Heng-Jui Chang
Massachusetts Institute of Technology
Speech Processing, Deep Learning

Zeqi Zhou
University of Tokyo, RIKEN AIP

Haokai Zhao
University of New South Wales
Deep Learning

Ma Jian
Shanghai Jiao Tong University

Carl Ma
Stanford University

Yu-Chi Cheng
Harvard University

Kuan Pang
Stanford University

Xiangru Tang
Yale University

Zehong Wang
University of Notre Dame
Machine Learning, Foundation Model, Graph Learning

Guanlue Li
University of Hamburg, Germany

Hanchen Wang
Stanford University

Kejun Ying
Stanford University

Pan Lu
Stanford University
Machine Learning, Natural Language Processing, Machine Reasoning, Mathematical Reasoning

Chiho Im
Stanford University

Seungju Han
Stanford University
Deep Learning, Machine Learning

Peng Xia
PhD student, Department of Computer Science, UNC Chapel Hill
Multimodal Agent, Healthcare

Tinson Xu
University of Chicago, USA

Yinxi Li
University of Waterloo

Deyao Zhu
Research Scientist, ByteDance Seed
Reinforcement Learning, Vision Language Models

Pheng-Ann Heng
Chinese University of Hong Kong

Naoto Yokoya
The University of Tokyo, RIKEN
Remote Sensing, Computer Vision, Machine Learning, Data Fusion