Protein-SE(3): Benchmarking SE(3)-based Generative Models for Protein Structure Design

📅 2025-07-27

🤖 AI Summary

Problem: SE(3)-equivariant generative models show growing promise for protein structure design, but the field lacks a modular, standardized benchmark for fair, systematic, and reproducible evaluation. Method: The paper introduces Protein-SE(3), a modular benchmark for protein backbone generation built on a unified training and evaluation framework. It integrates three mainstream generative paradigms: denoising diffusion probabilistic models (DDPM; Genie1 and Genie2), score matching (FrameDiff and RfDiffusion), and flow matching (FoldFlow and FrameFlow). A high-level abstraction of the underlying mathematics decouples each algorithm from explicit protein structures, which enables rapid prototyping of new methods. Contribution/Results: All integrated methods are trained on the same dataset and scored with the same evaluation metrics, which allows reproducible comparison across methods and architectures. The authors describe the result as the first comprehensive benchmark for SE(3)-based protein structure design built on a unified training framework.

📝 Abstract

SE(3)-based generative models have shown great promise in protein geometry modeling and effective structure design. However, the field currently lacks a modularized benchmark to enable comprehensive investigation and fair comparison of different methods. In this paper, we propose Protein-SE(3), a new benchmark based on a unified training framework, which comprises protein scaffolding tasks, integrated generative models, high-level mathematical abstraction, and diverse evaluation metrics. Recent advanced generative models designed for protein scaffolding are integrated into our framework, covering multiple perspectives: DDPM (Genie1 and Genie2), Score Matching (FrameDiff and RfDiffusion), and Flow Matching (FoldFlow and FrameFlow). All integrated methods are evaluated fairly, using the same training dataset and evaluation metrics. Furthermore, we provide a high-level abstraction of the mathematical foundations behind the generative models, enabling fast prototyping of future algorithms without reliance on explicit protein structures. Accordingly, we release the first comprehensive benchmark built upon a unified training framework for SE(3)-based protein structure design, which is publicly accessible at https://github.com/BruthYU/protein-se3.
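To make "SE(3)-based" concrete: models in this family commonly describe each residue as a rigid frame, that is, a rotation in SO(3) paired with a translation in R^3. The sketch below is not taken from the Protein-SE(3) code base. It is a minimal NumPy illustration, under that assumption, of how one frame can be interpolated toward another (a geodesic on the rotation, a straight line on the translation), which is the kind of path a flow-matching method could be trained along. All function names are hypothetical.

```python
import numpy as np

def hat(w):
    """Map a 3-vector to its skew-symmetric (cross-product) matrix."""
    x, y, z = w
    return np.array([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]])

def so3_exp(w):
    """Rodrigues' formula: rotation vector -> rotation matrix."""
    theta = np.linalg.norm(w)
    K = hat(w)
    if theta < 1e-8:
        return np.eye(3) + K  # first-order approximation near the identity
    return (np.eye(3)
            + (np.sin(theta) / theta) * K
            + ((1.0 - np.cos(theta)) / theta**2) * (K @ K))

def so3_log(R):
    """Rotation matrix -> rotation vector (valid for angles below pi)."""
    cos_theta = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    theta = np.arccos(cos_theta)
    # R - R^T = 2 sin(theta) * hat(axis), so read the axis off the antisymmetric part.
    vee = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
    if theta < 1e-8:
        return 0.5 * vee
    return theta / (2.0 * np.sin(theta)) * vee

def interpolate_frame(R0, x0, R1, x1, t):
    """Hypothetical helper: geodesic on SO(3), straight line in R^3, t in [0, 1]."""
    R_t = R0 @ so3_exp(t * so3_log(R0.T @ R1))
    x_t = (1.0 - t) * x0 + t * x1
    return R_t, x_t

# Example: move halfway from one frame to another.
R0, x0 = so3_exp(np.array([0.3, -0.2, 0.5])), np.array([0.0, 0.0, 0.0])
R1, x1 = so3_exp(np.array([-0.4, 0.1, 0.9])), np.array([1.0, 2.0, 3.0])
R_half, x_half = interpolate_frame(R0, x0, R1, x1, 0.5)
```

The intermediate rotation stays a valid rotation matrix at every t, which is the property that separates interpolation on SO(3) from naive linear blending of matrix entries.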

Problem

Research questions and friction points this paper is trying to address.

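The field lacks a modularized benchmark for SE(3)-based protein generative models

Fair comparison of methods requires a unified framework with shared training data and evaluation metrics

Prototyping new algorithms requires a mathematical abstraction that does not depend on explicit protein structures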

Innovation

Methods, ideas, or system contributions that make the work stand out.

Unified training framework for protein design

Integration of diverse SE(3)-based generative models (DDPM, score matching, flow matching)

Mathematical abstraction enabling fast prototyping
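One way to picture the "mathematical abstraction" idea is a small shared interface that different generative paradigms implement, so that the training loop never needs to know which one it is running. The sketch below is an assumption for illustration only: the class and method names are hypothetical and do not come from the Protein-SE(3) repository, it works on plain Euclidean arrays rather than SE(3) frames, and it covers only two of the three paradigms (a DDPM-style noising process and a linear flow-matching interpolant).

```python
from abc import ABC, abstractmethod
import numpy as np

class ForwardProcess(ABC):
    """Hypothetical interface: how to corrupt clean data and what to regress on."""

    @abstractmethod
    def corrupt(self, x_clean, t, noise):
        """Return the corrupted sample at time t in [0, 1]."""

    @abstractmethod
    def target(self, x_clean, t, noise):
        """Return the regression target for the network at time t."""

class DDPMProcess(ForwardProcess):
    """Variance-preserving noising; t=0 is clean data, t=1 is (almost) pure noise."""

    def alpha_bar(self, t):
        # Cosine-shaped schedule, chosen here only for illustration.
        return np.cos(0.5 * np.pi * t) ** 2

    def corrupt(self, x_clean, t, noise):
        a = self.alpha_bar(t)
        return np.sqrt(a) * x_clean + np.sqrt(1.0 - a) * noise

    def target(self, x_clean, t, noise):
        return noise  # epsilon-prediction objective

class FlowMatchingProcess(ForwardProcess):
    """Linear interpolant; note the opposite convention: t=0 is noise, t=1 is data."""

    def corrupt(self, x_clean, t, noise):
        return (1.0 - t) * noise + t * x_clean

    def target(self, x_clean, t, noise):
        return x_clean - noise  # constant velocity along the straight path

def regression_loss(process, predict, x_clean, t, rng):
    """Paradigm-agnostic mean-squared-error loss for one time value t."""
    noise = rng.standard_normal(x_clean.shape)
    x_t = process.corrupt(x_clean, t, noise)
    return float(np.mean((predict(x_t, t) - process.target(x_clean, t, noise)) ** 2))

# Usage: the same loss function serves both paradigms.
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 3))              # toy stand-in for backbone coordinates
zero_model = lambda x_t, t: np.zeros_like(x_t)
losses = {type(p).__name__: regression_loss(p, zero_model, x, 0.5, rng)
          for p in (DDPMProcess(), FlowMatchingProcess())}
```

Because `regression_loss` depends only on the two interface methods, a new paradigm can be tried by adding one class, which is the kind of fast prototyping the highlights describe.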
