Traditional machine learning vs. deep learning from dynamic graph representations of proteins' 3D folds in the task of protein structure classification

📅 2026-05-27

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study investigates whether deep learning (DL) on dynamic graph representations of protein three-dimensional structures improves protein structure classification over traditional machine learning (ML). Using dynamic protein structure networks (PSNs), it presents the first comparison of the two paradigms on this task. The evaluation covers 72 datasets comprising approximately 44,000 CATH- or SCOPe-labeled dynamic PSNs, and contrasts handcrafted features paired with an off-the-shelf classifier against deep learning applied directly to the dynamic PSNs. In classification accuracy, the two paradigms are tied or close to tied on a large majority of the datasets, while DL is on average more than 10 times slower. Traditional ML therefore offers a substantial speed advantage while remaining competitive in accuracy for protein structure classification.

📝 Abstract

Protein structure classification (PSC) uses supervised learning to predict a protein's CATH/SCOP(e) class from the protein's sequence or 3D structural feature(s). We previously modeled 3D structures as (static) protein structure networks (PSNs) and showed that PSN-based features are competitive with sequence and direct (i.e., non-network) 3D structural features in the PSC task. More recently, we demonstrated that features extracted from dynamic PSNs outperform features extracted from static PSNs (and thus, by transitivity, sequence and direct 3D structural features) in the same task. That dynamic PSN approach used traditional machine learning (ML), combining manual (pre-engineered) features with an off-the-shelf classifier. Here, we evaluate whether automatic deep learning (DL) from the dynamic PSNs yields improvements. Our evaluation on 72 datasets spanning ~44,000 CATH- or SCOPe-labeled dynamic PSNs reveals that, in terms of PSC accuracy, traditional ML and DL are (close to) tied for a large majority of the datasets, while DL is on average 10+ times slower. We are the first to evaluate traditional ML vs. DL in the dynamic PSN-based PSC task.
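
To make the "manual features plus off-the-shelf classifier" pipeline from the abstract concrete, here is a minimal, standard-library-only sketch. Everything specific in it is an assumption for illustration and is not taken from the paper: the 6.0 distance cutoff, the construction of snapshots from growing prefixes of the residue chain, the two per-snapshot features (edge density and mean degree), the nearest-centroid classifier, and the toy labels "compact" and "extended" (which are not CATH/SCOPe classes). The function names are hypothetical.

```python
import math

def contact_graph(coords, cutoff=6.0):
    """Edges (i, j), i < j, between residues closer than `cutoff` (assumed value)."""
    return {
        (i, j)
        for i in range(len(coords))
        for j in range(i + 1, len(coords))
        if math.dist(coords[i], coords[j]) < cutoff
    }

def dynamic_psn(coords, n_snapshots=4, cutoff=6.0):
    """(size, edges) snapshots over growing chain prefixes (an assumed scheme)."""
    n = len(coords)
    sizes = [max(2, round(n * (k + 1) / n_snapshots)) for k in range(n_snapshots)]
    return [(s, contact_graph(coords[:s], cutoff)) for s in sizes]

def handcrafted_features(snapshots):
    """Two stand-in features per snapshot: edge density and mean degree."""
    feats = []
    for size, edges in snapshots:
        feats.append(len(edges) / (size * (size - 1) / 2))
        feats.append(2 * len(edges) / size)
    return tuple(feats)

def fit_centroids(X, y):
    """Nearest-centroid 'training': average the feature vectors of each label."""
    groups = {}
    for x, label in zip(X, y):
        groups.setdefault(label, []).append(x)
    return {
        label: tuple(sum(col) / len(rows) for col in zip(*rows))
        for label, rows in groups.items()
    }

def predict(centroids, x):
    """Return the label whose centroid is closest to feature vector x."""
    return min(centroids, key=lambda label: math.dist(centroids[label], x))

# Toy 3D chains (synthetic, not real proteins): a coiled chain and a straight one.
def helix(n):
    return [(2.3 * math.cos(math.radians(100 * i)),
             2.3 * math.sin(math.radians(100 * i)),
             1.5 * i) for i in range(n)]

def strand(n):
    return [(3.8 * i, 0.0, 0.0) for i in range(n)]

train = [(helix(12), "compact"), (helix(16), "compact"),
         (strand(12), "extended"), (strand(16), "extended")]
X = [handcrafted_features(dynamic_psn(chain)) for chain, _ in train]
y = [label for _, label in train]
model = fit_centroids(X, y)

# Classify an unseen, longer coiled chain.
print(predict(model, handcrafted_features(dynamic_psn(helix(20)))))
```

A DL counterpart would replace `handcrafted_features` with a representation learned from the snapshot sequence itself; that learning step is where the extra running time reported in the abstract would be expected to arise.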

Problem

Research questions and friction points this paper is trying to address.

protein structure classification

dynamic protein structure networks

traditional machine learning

deep learning

CATH/SCOPe

Innovation

Methods, ideas, or system contributions that make the work stand out.

dynamic protein structure networks

protein structure classification

traditional machine learning