ProtInvTree: Deliberate Protein Inverse Folding with Reward-guided Tree Search

📅 2025-06-01
🤖 AI Summary
The core challenge in protein inverse folding is that many diverse sequences can fold into the same target structure (“one-structure-multiple-sequences”), whereas existing methods focus primarily on recovering native sequences and do not model diversity explicitly. This paper introduces ProtInvTree, the first reward-guided tree-search framework for inverse folding. It features a two-stage focus-and-grounding action mechanism, which decouples position selection from residue generation, and a jumpy denoising strategy that evaluates intermediate states without full rollouts, so search depth and breadth can be expanded at inference time without retraining. The method builds on pretrained protein language models and combines self-evaluation, lookahead, and backtracking. Across multiple benchmarks, it outperforms state-of-the-art baselines, generating structurally consistent yet diverse sequences, including ones far from the native sequence.

📝 Abstract
Designing protein sequences that fold into a target 3D structure, known as protein inverse folding, is a fundamental challenge in protein engineering. While recent deep learning methods have achieved impressive performance by recovering native sequences, they often overlook the one-to-many nature of the problem: multiple diverse sequences can fold into the same structure. This motivates the need for a generative model capable of designing diverse sequences while preserving structural consistency. To address this trade-off, we introduce ProtInvTree, the first reward-guided tree-search framework for protein inverse folding. ProtInvTree reformulates sequence generation as a deliberate, step-wise decision-making process, enabling the exploration of multiple design paths and exploitation of promising candidates through self-evaluation, lookahead, and backtracking. We propose a two-stage focus-and-grounding action mechanism that decouples position selection and residue generation. To efficiently evaluate intermediate states, we introduce a jumpy denoising strategy that avoids full rollouts. Built upon pretrained protein language models, ProtInvTree supports flexible test-time scaling by expanding the search depth and breadth without retraining. Empirically, ProtInvTree outperforms state-of-the-art baselines across multiple benchmarks, generating structurally consistent yet diverse sequences, including those far from the native ground truth.
Problem

Research questions and friction points this paper is trying to address.

Designing diverse protein sequences for target 3D structures
Overcoming one-to-many challenge in protein inverse folding
Ensuring structural consistency while generating varied sequences
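
The tension between recovering the native sequence and producing varied designs can be made concrete with two simple sequence-level metrics. The sketch below is illustrative only: the function names, the example sequences, and the choice of mean pairwise Hamming distance as a diversity measure are assumptions, not definitions taken from the paper.

```python
from itertools import combinations


def recovery(design: str, native: str) -> float:
    """Fraction of positions where the design matches the native sequence."""
    if len(design) != len(native):
        raise ValueError("sequences must have the same length")
    return sum(a == b for a, b in zip(design, native)) / len(native)


def diversity(designs: list[str]) -> float:
    """Mean pairwise normalized Hamming distance across a set of designs."""
    pairs = list(combinations(designs, 2))
    if not pairs:
        return 0.0
    return sum(1.0 - recovery(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy sequences, chosen only to exercise the functions.
native = "MKTAYIAK"
designs = ["MKTAYIAK", "MKSAYLAK", "LRTGYIGR"]

per_design_recovery = [recovery(d, native) for d in designs]
set_diversity = diversity(designs)
```

A method tuned only for recovery pushes every design toward the native sequence, which drives the diversity score toward zero. Structural consistency itself cannot be read from the sequence alone and would need a separate structure-based check.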
Innovation

Methods, ideas, or system contributions that make the work stand out.

Reward-guided tree-search framework for protein inverse folding
Two-stage focus-and-grounding action mechanism
Jumpy denoising strategy for intermediate state evaluation
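
The three ideas above can be sketched together in a toy setting. This is a minimal sketch under loud assumptions, not the authors' implementation: the reward is a hypothetical hydrophobic/polar pattern match standing in for a structural score, residues are sampled uniformly instead of from a pretrained protein language model, and a best-first search with a priority queue is swapped in for whatever search policy the paper uses. All function names are invented for illustration.

```python
import heapq
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
HYDROPHOBIC = set("AVILMFWC")
MASK = "_"


def toy_reward(seq, pattern):
    """Hypothetical reward: fraction of positions whose hydrophobic (H) or
    polar (P) class matches the target pattern. Many sequences score equally."""
    hits = sum((aa in HYDROPHOBIC) == (p == "H") for aa, p in zip(seq, pattern))
    return hits / len(pattern)


def focus(state, k, rng):
    """Stage 1: pick up to k masked positions to fill next."""
    masked = [i for i, aa in enumerate(state) if aa == MASK]
    return rng.sample(masked, min(k, len(masked)))


def ground(state, positions, rng):
    """Stage 2: assign residues to the focused positions (uniform sampling
    here; a real system would sample from a pretrained model)."""
    out = list(state)
    for i in positions:
        out[i] = rng.choice(AMINO_ACIDS)
    return "".join(out)


def jumpy_complete(state, rng):
    """Fill all remaining masks in one jump so a partial state can be scored."""
    return "".join(rng.choice(AMINO_ACIDS) if aa == MASK else aa for aa in state)


def tree_search(pattern, k=2, breadth=4, budget=200, seed=0):
    """Best-first search over partially masked sequences."""
    rng = random.Random(seed)
    root = MASK * len(pattern)
    best_seq = jumpy_complete(root, rng)
    best_val = toy_reward(best_seq, pattern)
    frontier = [(-best_val, 0, root)]  # negated value turns heapq into a max-heap
    counter = 1  # unique tie-breaker for heap entries
    for _ in range(budget):
        if not frontier:
            break
        _, _, state = heapq.heappop(frontier)  # may return to an earlier branch
        if MASK not in state:
            continue
        for _ in range(breadth):
            child = ground(state, focus(state, k, rng), rng)
            guess = jumpy_complete(child, rng)  # one-jump lookahead, no full rollout
            val = toy_reward(guess, pattern)
            if val > best_val:
                best_seq, best_val = guess, val
            heapq.heappush(frontier, (-val, counter, child))
            counter += 1
    return best_seq, best_val


best_seq, best_val = tree_search("HPHPPHHP")
```

In this sketch, `breadth` (children per expansion) and `budget` (number of expansions) are the knobs that correspond to scaling the search at inference time; raising them costs more reward evaluations but requires no retraining. Because the frontier is ordered by estimated value, the search can abandon a branch and resume from an earlier node, which is one simple way to realize backtracking.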
Authors

Mengdi Liu
Institute of Computing Technology, Chinese Academy of Sciences
Diffusion models, AI4Science

Xiaoxue Cheng
Renmin University of China

Zhangyang Gao
AI Lab, Research Center for Industries of the Future, Westlake University

Hong Chang
Researcher at Institute of Computing Technology, Chinese Academy of Sciences
Machine Learning, Computer Vision, Pattern Recognition

Cheng Tan
AI Lab, Research Center for Industries of the Future, Westlake University

Shiguang Shan
Professor at Institute of Computing Technology, Chinese Academy of Sciences
Computer Vision, Pattern Recognition, Machine Learning, Face Recognition

Xilin Chen
Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences