EquiformerV3: Scaling Efficient, Expressive, and General SE(3)-Equivariant Graph Attention Transformers

📅 2026-04-10

🤖 AI Summary

This work addresses three limitations of existing SE(3)-equivariant graph neural networks for 3D atomistic modeling: computational inefficiency, limited expressivity, and insufficient physical consistency. It introduces EquiformerV3, the third generation of the SE(3)-equivariant graph attention Transformer. Building on EquiformerV2, the architecture adds SwiGLU-S² activations, attention with a smooth radius cutoff, improved feedforward network hyper-parameters, and equivariant merged layer normalization, and it is trained with the auxiliary task of denoising non-equilibrium structures (DeNS). SwiGLU-S² activations incorporate many-body interactions, and together with smooth-cutoff attention they enable accurate modeling of smoothly varying potential energy surfaces (PES), which extends the model to energy-conserving simulations and higher-order derivatives of the PES. An optimized software implementation yields a 1.75× speedup, and the model achieves state-of-the-art results on the OC20, OMat24, and Matbench Discovery benchmarks.

📝 Abstract

As $SE(3)$-equivariant graph neural networks mature as a core tool for 3D atomistic modeling, improving their efficiency, expressivity, and physical consistency has become a central challenge for large-scale applications. In this work, we introduce EquiformerV3, the third generation of the $SE(3)$-equivariant graph attention Transformer, designed to advance all three dimensions: efficiency, expressivity, and generality. Building on EquiformerV2, we make three key advances. First, we optimize the software implementation, achieving a $1.75\times$ speedup. Second, we introduce simple and effective modifications to EquiformerV2, including equivariant merged layer normalization, improved feedforward network hyper-parameters, and attention with smooth radius cutoff. Third, we propose SwiGLU-$S^2$ activations to incorporate many-body interactions for better theoretical expressivity and to preserve strict equivariance while reducing the complexity of sampling $S^2$ grids. Together, SwiGLU-$S^2$ activations and smooth-cutoff attention enable accurate modeling of smoothly varying potential energy surfaces (PES), generalizing EquiformerV3 to tasks requiring energy-conserving simulations and higher-order derivatives of PES. With these improvements, EquiformerV3 trained with the auxiliary task of denoising non-equilibrium structures (DeNS) achieves state-of-the-art results on OC20, OMat24, and Matbench Discovery.
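
To make the idea of attention with a smooth radius cutoff concrete, here is a minimal sketch in plain Python. The abstract does not give the envelope or the normalization EquiformerV3 actually uses, so the cosine envelope, the placement of the envelope inside the softmax, and the names `cosine_cutoff` and `smooth_cutoff_attention` are all assumptions made for illustration.

```python
import math


def cosine_cutoff(r, r_cut):
    """Assumed envelope: 1 at r = 0, falls smoothly to 0 at r = r_cut, 0 beyond."""
    if r >= r_cut:
        return 0.0
    return 0.5 * (math.cos(math.pi * r / r_cut) + 1.0)


def smooth_cutoff_attention(logits, distances, r_cut):
    """Envelope-weighted softmax over the neighbors of one atom.

    Each neighbor's unnormalized weight exp(logit) is scaled by the envelope
    before normalization, so a neighbor sitting at the cutoff contributes
    exactly zero to both the numerator and the denominator.
    """
    envelopes = [cosine_cutoff(r, r_cut) for r in distances]
    shift = max(logits)  # subtract the max logit for numerical stability
    scaled = [math.exp(l - shift) * e for l, e in zip(logits, envelopes)]
    total = sum(scaled)
    if total == 0.0:
        # Every neighbor is at or beyond the cutoff: no attention mass.
        return [0.0 for _ in scaled]
    return [s / total for s in scaled]


# Hypothetical example: three neighbors, the last one exactly at the cutoff.
weights = smooth_cutoff_attention(
    logits=[1.0, 2.0, 0.5], distances=[1.0, 2.0, 5.0], r_cut=5.0
)
```

In this sketch, a neighbor that reaches the cutoff distance gets zero weight and leaves the other weights unchanged, so adding or dropping it from the neighbor list causes no jump. A hard cutoff, by contrast, changes the softmax denominator abruptly whenever a neighbor crosses the radius, which is the kind of discontinuity a smooth potential energy surface has to avoid.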

Problem

Research questions and friction points this paper is trying to address.

SE(3)-equivariance

graph neural networks

3D atomistic modeling

efficiency

expressivity

Innovation

Methods, ideas, or system contributions that make the work stand out.

SE(3)-equivariance

graph attention Transformer

SwiGLU-S² activation
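
For readers unfamiliar with the name, the following sketch shows the plain SwiGLU gating that SwiGLU-S² is named after, applied to an ordinary feature vector. It is not the paper's S² variant, whose grid sampling and equivariance handling are not described on this page. The helper names `swish`, `linear`, and `swiglu`, and the example weights, are hypothetical.

```python
import math


def swish(z):
    """Swish (SiLU) activation: z * sigmoid(z)."""
    return z / (1.0 + math.exp(-z))


def linear(x, weight):
    """Matrix-vector product; `weight` is a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weight]


def swiglu(x, w_gate, w_value):
    """Plain SwiGLU: swish(W_gate x) multiplied elementwise by (W_value x)."""
    gate = [swish(g) for g in linear(x, w_gate)]
    value = linear(x, w_value)
    return [g * v for g, v in zip(gate, value)]


# Hypothetical 2-dimensional example with hand-picked weights.
out = swiglu(
    x=[1.0, 2.0],
    w_gate=[[0.0, 0.0], [1.0, 0.0]],
    w_value=[[1.0, 1.0], [0.0, 1.0]],
)
```

The output is a product of two different linear projections of the same input, one of them passed through a nonlinearity. That multiplicative structure is what distinguishes gated activations from a pointwise nonlinearity applied to a single projection.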