🤖 AI Summary
The Evoformer architecture in AlphaFold suffers from high computational overhead and an inflexible depth imposed by its discrete stack of 48 blocks. Method: This work proposes a continuous-depth Evoformer, which the authors present as the first application of Neural Ordinary Differential Equations (Neural ODEs) to protein structure prediction. It preserves the original attention mechanisms and geometric modeling while formulating hidden-state evolution as an ODE over continuous depth; adaptive step-size solvers trade accuracy against speed, and the adjoint method enables constant-memory backpropagation. Contribution/Results: Trained on a single GPU in just 17.5 hours, the model accurately reconstructs secondary-structure elements such as α-helices. Experiments demonstrate substantial reductions in both computational cost and memory footprint, validating the efficacy and lightweight potential of continuous-depth modeling for protein folding.
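The continuous-depth idea can be sketched in a few lines: instead of stacking 48 discrete blocks, a single learned vector field `dh/dt = f(h, t; θ)` is integrated over a "depth" interval, and the solver's tolerance becomes the runtime/accuracy knob. The sketch below is a toy illustration, not the paper's code: the `tanh`-of-a-random-matrix dynamics is a stand-in for the attention-based Evoformer update, and `scipy`'s adaptive RK45 solver stands in for the paper's ODE solver.

```python
import numpy as np
from scipy.integrate import solve_ivp

rng = np.random.default_rng(0)
d = 16                                   # toy hidden dimension (real Evoformer states are far larger)
W = rng.normal(scale=0.1, size=(d, d))   # illustrative stand-in for the block's learned weights

def block_dynamics(t, h):
    # Continuous-depth analogue of one Evoformer block: dh/dt = f(h, t; theta).
    # A real implementation would evaluate the attention-based update here.
    return np.tanh(W @ h)

h0 = rng.normal(size=d)  # initial hidden state at "depth" t = 0

# Integrate depth from t=0 to t=1 with an adaptive step-size solver (RK45).
# Loosening rtol/atol reduces function evaluations at some cost in accuracy,
# which is the runtime-vs-accuracy trade-off described above.
loose = solve_ivp(block_dynamics, (0.0, 1.0), h0, rtol=1e-3, atol=1e-6)
tight = solve_ivp(block_dynamics, (0.0, 1.0), h0, rtol=1e-8, atol=1e-10)

final_loose = loose.y[:, -1]  # hidden state at t = 1, shape (16,)
final_tight = tight.y[:, -1]
print(np.abs(final_loose - final_tight).max())  # both tolerances land close together
```

In a trainable version one would use an ODE library with adjoint-based gradients (e.g. `odeint_adjoint` in torchdiffeq) rather than `scipy`, so that backpropagation does not store intermediate states.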
📝 Abstract
Recent advances in protein structure prediction, such as AlphaFold, have demonstrated the power of deep neural architectures like the Evoformer for capturing complex spatial and evolutionary constraints on protein conformation. However, the depth of the Evoformer, comprising 48 stacked blocks, introduces high computational costs and rigid layerwise discretization. Inspired by Neural Ordinary Differential Equations (Neural ODEs), we propose a continuous-depth formulation of the Evoformer, replacing its 48 discrete blocks with a Neural ODE parameterization that preserves its core attention-based operations. This continuous-depth Evoformer achieves constant memory cost (in depth) via the adjoint method, while allowing a principled trade-off between runtime and accuracy through adaptive ODE solvers. Benchmarking on protein structure prediction tasks, we find that the Neural ODE-based Evoformer produces structurally plausible predictions and reliably captures certain secondary-structure elements, such as α-helices, though it does not fully replicate the accuracy of the original architecture. However, our model achieves this performance using dramatically fewer resources: just 17.5 hours of training on a single GPU, highlighting the promise of continuous-depth models as a lightweight and interpretable alternative for biomolecular modeling. This work opens new directions for efficient and adaptive protein structure prediction frameworks.
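The constant-memory claim follows from the standard adjoint method for Neural ODEs (Chen et al., 2018); the equations below restate that general result in the continuous-depth setting, with hidden state $h(t)$ and depth variable $t \in [0, 1]$, rather than reproducing this paper's own derivation:

```latex
% Forward pass: depth treated as continuous time
\frac{dh(t)}{dt} = f_\theta\bigl(h(t), t\bigr), \qquad
h(1) = h(0) + \int_0^1 f_\theta\bigl(h(t), t\bigr)\,dt

% Backward pass: the adjoint a(t) = \partial L / \partial h(t) obeys its own ODE,
% integrated backward from t = 1
\frac{da(t)}{dt} = -\,a(t)^{\top} \frac{\partial f_\theta\bigl(h(t), t\bigr)}{\partial h}, \qquad
\frac{\partial L}{\partial \theta} = -\int_{1}^{0} a(t)^{\top} \frac{\partial f_\theta\bigl(h(t), t\bigr)}{\partial \theta}\,dt
```

Because $h(t)$ can be recomputed on the fly while solving the adjoint ODE backward, no intermediate activations need to be stored, so memory cost stays constant regardless of how many solver steps the "depth" integration takes.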