🤖 AI Summary
The internal mechanisms of protein structure prediction models—such as AlphaFold2 and OpenFold—remain largely opaque, hindering model diagnosis, optimization, and biological interpretation.
Method: This work introduces an explainability framework based on ablation analysis and component-wise contribution quantification to systematically assess the impact of individual OpenFold modules on prediction accuracy. Using multi-scale metrics—including pLDDT and RMSD—we evaluate module contributions across a diverse set of protein targets and analyze their dependence on sequence length.
Contribution/Results: We reveal pronounced protein specificity in the contributions of core modules (e.g., Evoformer, Structure Module), with several components exhibiting nonlinear sensitivity to protein length. This study presents the first fine-grained, component-level attribution analysis of OpenFold, delivering a reproducible methodological framework and empirical evidence to support model debugging, lightweight architecture design, and enhanced biological interpretability.
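The ablation-based attribution described above can be sketched as a simple loop: record a baseline accuracy, disable one module at a time, and score each module by the resulting accuracy drop. The module names, the `predict` interface, and the toy predictor below are illustrative assumptions, not OpenFold's actual API.

```python
# Hypothetical sketch of component-wise ablation analysis.
# Module names and the predictor signature are assumptions for illustration.
MODULES = ["evoformer", "structure_module", "template_embedder", "recycling"]

def ablation_contributions(predict, sequence, modules=MODULES):
    """Score each module by the accuracy drop when it is disabled.

    `predict(sequence, disabled)` must return a scalar accuracy
    (e.g. mean pLDDT on a 0-100 scale); higher is better.
    """
    baseline = predict(sequence, disabled=frozenset())
    return {
        m: baseline - predict(sequence, disabled=frozenset({m}))
        for m in modules
    }

# Toy stand-in predictor: each disabled module subtracts a fixed amount,
# standing in for the accuracy loss a real ablation would measure.
toy_weights = {"evoformer": 30.0, "structure_module": 20.0,
               "template_embedder": 5.0, "recycling": 2.0}

def toy_predict(sequence, disabled):
    return 90.0 - sum(toy_weights[m] for m in disabled)

contrib = ablation_contributions(toy_predict, "MKTAYIAK")
print(contrib)  # in this toy setting, the Evoformer contributes most
```

In a real setting, `predict` would wrap an OpenFold inference run with the chosen module bypassed or zeroed out, and the scalar would be pLDDT or negative RMSD against a reference structure; the protein-specific and length-dependent effects reported above correspond to how this contribution dictionary changes across targets.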
📝 Abstract
Models such as AlphaFold2 and OpenFold have transformed protein structure prediction, yet their inner workings remain poorly understood. We present a methodology to systematically evaluate the contribution of individual OpenFold components to structure prediction accuracy. We identify components that are critical across most proteins, as well as others whose importance varies from protein to protein. We further show that the contribution of several components is correlated with protein length. These findings provide insight into how OpenFold achieves accurate predictions and highlight directions for interpreting protein prediction networks more broadly.