Protein-SE(3): Benchmarking SE(3)-based Generative Models for Protein Structure Design

📅 2025-07-27
🤖 AI Summary
Problem: SE(3)-based generative models show growing promise for protein structure design, but the field lacks a modular, standardized benchmark for fair, systematic, and reproducible comparison. Method: The paper introduces Protein-SE(3), a modular benchmark for protein backbone generation built on a unified training and evaluation framework. It integrates three mainstream generative paradigms: denoising diffusion probabilistic models (DDPM; Genie1 and Genie2), score matching (FrameDiff and RFdiffusion), and flow matching (FoldFlow and FrameFlow). A high-level abstraction of the underlying mathematics decouples algorithm implementation from explicit protein structures, which enables rapid prototyping of new methods. Contribution/Results: All integrated methods are trained on the same dataset and scored with the same evaluation metrics, giving a reproducible cross-method comparison and advancing standardization in protein generative modeling.
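To make the three paradigms named above concrete, here is a minimal sketch of the training targets each one regresses onto, written for plain Euclidean coordinates (for example, residue translations). It is an illustration under stated assumptions, not the benchmark's code: the function names are hypothetical, and the rotation component in SO(3) needs manifold-aware versions that are not shown.

```python
import math

def ddpm_forward(x_data, alpha_bar, eps):
    """DDPM forward noising: x_t = sqrt(alpha_bar) * x_data + sqrt(1 - alpha_bar) * eps.
    A DDPM-style model is typically trained to predict eps from x_t."""
    a = math.sqrt(alpha_bar)
    b = math.sqrt(1.0 - alpha_bar)
    return [a * x + b * e for x, e in zip(x_data, eps)]

def score_target(eps, alpha_bar):
    """Denoising score matching target for the same Gaussian kernel:
    grad log q(x_t | x_data) = -eps / sqrt(1 - alpha_bar)."""
    b = math.sqrt(1.0 - alpha_bar)
    return [-e / b for e in eps]

def flow_matching_pair(x_noise, x_data, t):
    """Flow matching with a straight-line path (an assumed choice of path):
    x_t = (1 - t) * x_noise + t * x_data, target velocity = x_data - x_noise."""
    x_t = [(1.0 - t) * n + t * d for n, d in zip(x_noise, x_data)]
    velocity = [d - n for n, d in zip(x_noise, x_data)]
    return x_t, velocity

# Toy 3D coordinate and a fixed noise sample (hypothetical values).
x_data = [1.0, 2.0, 3.0]
eps = [0.5, -1.0, 0.0]
x_t = ddpm_forward(x_data, 0.75, eps)
```

The three functions share the same inputs (a data point and a noise sample) and differ only in the interpolation and the regression target, which is the kind of difference a unified training framework can isolate.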

📝 Abstract
SE(3)-based generative models have shown great promise in protein geometry modeling and effective structure design. However, the field currently lacks a modularized benchmark that enables comprehensive investigation and fair comparison of different methods. In this paper, we propose Protein-SE(3), a new benchmark based on a unified training framework, which comprises protein scaffolding tasks, integrated generative models, a high-level mathematical abstraction, and diverse evaluation metrics. Recent generative models designed for protein scaffolding are integrated into our framework, covering three paradigms: DDPM (Genie1 and Genie2), score matching (FrameDiff and RFdiffusion), and flow matching (FoldFlow and FrameFlow). All integrated methods are compared fairly, using the same training dataset and evaluation metrics. Furthermore, we provide a high-level abstraction of the mathematical foundations behind the generative models, enabling fast prototyping of future algorithms without reliance on explicit protein structures. Accordingly, we release the first comprehensive benchmark built upon a unified training framework for SE(3)-based protein structure design, which is publicly accessible at https://github.com/BruthYU/protein-se3.
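For readers new to the term, an SE(3) element is a rigid motion of 3D space: a rotation R together with a translation t. The sketch below is a minimal, dependency-free illustration of applying and composing such transforms and of the property that makes them natural for backbone modeling, namely that they preserve distances between points. The helper names and the toy coordinates are hypothetical and are not taken from the benchmark repository.

```python
import math

def rot_z(theta):
    """3x3 rotation matrix about the z-axis (an element of SO(3))."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def apply(frame, point):
    """Apply a rigid transform (R, t) to a 3D point: R @ point + t."""
    R, t = frame
    return [sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3)]

def compose(a, b):
    """Compose two rigid transforms so the result acts as `a` after `b`."""
    Ra, ta = a
    Rb, _ = b
    R = [[sum(Ra[i][k] * Rb[k][j] for k in range(3)) for j in range(3)]
         for i in range(3)]
    t = apply((Ra, ta), b[1])  # Ra @ tb + ta
    return (R, t)

def dist(p, q):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))

# Toy coordinates standing in for three consecutive backbone atoms.
coords = [[0.0, 0.0, 0.0], [3.8, 0.0, 0.0], [5.0, 3.6, 0.0]]
g = (rot_z(math.pi / 3), [1.0, -2.0, 0.5])
h = (rot_z(-0.4), [0.0, 1.5, -3.0])

moved = [apply(g, p) for p in coords]
```

Because `moved` differs from `coords` only by a rigid motion, all pairwise distances are unchanged, so a model that is equivariant to SE(3) treats the two as the same structure in a different pose.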
Problem

Research questions and friction points this paper is trying to address.

The field lacks a modularized benchmark for SE(3)-based protein generative models
Fair comparison of methods requires a unified training and evaluation framework
Prototyping new algorithms requires a high-level mathematical abstraction
Innovation

Methods, ideas, or system contributions that make the work stand out.

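Unified training framework for SE(3)-based protein structure design
Integration of diverse SE(3)-based generative models (DDPM, score matching, flow matching)
Mathematical abstraction that enables fast prototyping without explicit protein structures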
Authors

Lang Yu
East China Normal University
Machine Learning, Deep Learning
Zhangyang Gao
AI Lab, Research Center for Industries of the Future, Westlake University
Cheng Tan
AI Lab, Research Center for Industries of the Future, Westlake University
Qin Chen
School of Computer Science and Technology, East China Normal University
Jie Zhou
School of Computer Science and Technology, East China Normal University
Liang He
School of Computer Science and Technology, East China Normal University