Generalizable, Fast, and Accurate DeepQSPR with fastprop Part 1: Framework and Benchmarks

📅 2024-04-02
🏛️ arXiv.org
📈 Citations: 0
Influential: 0

🤖 AI Summary
Traditional QSPR methods rely on expert-crafted molecular descriptors and suffer from poor generalizability and limited adaptability. This paper proposes *fastprop*, a lightweight QSPR framework that feeds physicochemically inspired, handcrafted descriptors into a multilayer perceptron (MLP). It deliberately avoids end-to-end representation learning with graph neural networks (GNNs) or Transformers, aiming to balance interpretability, predictive accuracy, and computational efficiency. With systematic hyperparameter optimization and standardized preprocessing across datasets, *fastprop* achieves significantly lower mean absolute error (MAE) and root-mean-square error (RMSE) than leading models, including GCN, MPNN, and D-MPNN, across 23 molecular property prediction benchmarks. It reduces training time by over 90% and accelerates inference by one to two orders of magnitude. The implementation and pretrained models are publicly available.
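The descriptor-then-MLP data flow described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not fastprop's actual code or API: `toy_descriptors` is a hypothetical stand-in that counts characters in a SMILES string instead of computing real molecular descriptors, the layer sizes are arbitrary, and the MLP weights are random and untrained, so the outputs show only the shape of the pipeline.

```python
import math
import random


def toy_descriptors(smiles):
    # Hypothetical stand-in for a real descriptor calculator:
    # crude character counts taken directly from the SMILES string.
    return [
        float(len(smiles)),
        float(smiles.count("C") + smiles.count("c")),
        float(smiles.count("O") + smiles.count("o")),
        float(smiles.count("N") + smiles.count("n")),
        float(smiles.count("=")),
    ]


def standardize(rows):
    # Rescale every descriptor column to zero mean and unit variance.
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    # A constant column has zero spread; fall back to 1.0 to avoid dividing by zero.
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in c) / n) or 1.0
        for c, m in zip(cols, means)
    ]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def init_mlp(n_in, n_hidden, seed=0):
    # Random, untrained weights for one hidden layer and a scalar output.
    rng = random.Random(seed)
    w1 = [[rng.gauss(0, 1 / math.sqrt(n_in)) for _ in range(n_in)]
          for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [rng.gauss(0, 1 / math.sqrt(n_hidden)) for _ in range(n_hidden)]
    b2 = 0.0
    return w1, b1, w2, b2


def mlp_forward(x, params):
    # Linear layer, ReLU activation, then a linear read-out to one property value.
    w1, b1, w2, b2 = params
    hidden = [
        max(0.0, sum(w * v for w, v in zip(row, x)) + b)
        for row, b in zip(w1, b1)
    ]
    return sum(w * h for w, h in zip(w2, hidden)) + b2


smiles = ["CCO", "c1ccccc1", "CC(=O)O", "CCN"]
features = standardize([toy_descriptors(s) for s in smiles])
params = init_mlp(n_in=len(features[0]), n_hidden=8)
predictions = [mlp_forward(x, params) for x in features]
```

In a real workflow the toy descriptor function would be replaced by a proper descriptor calculator and the weights would be fit to measured property values; the point here is only that the model consumes a fixed-length descriptor vector per molecule rather than a learned graph representation.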

📝 Abstract
Quantitative Structure-Property Relationship (QSPR) studies aim to define a mapping between molecular structure and arbitrary quantities of interest. Historically, this was accomplished by developing descriptors, an approach that requires significant domain expertise and struggles to generalize. Thus, the field has morphed into Molecular Property Prediction and been given over to learned representations, which are highly generalizable. The paper introduces fastprop, a DeepQSPR framework which uses a cogent set of molecular-level descriptors to meet and exceed the performance of learned representations on diverse datasets in dramatically less time. fastprop is freely available on GitHub at github.com/JacksonBurns/fastprop.
Problem

Research questions and friction points this paper is trying to address.

Quantitative Structure-Property Relationships
Descriptor Engineering
Predictive Modeling
Innovation

Methods, ideas, or system contributions that make the work stand out.

fastprop
learning descriptors
DeepQSPR framework