🤖 AI Summary
This study addresses the problem of predicting the passive permeability of drug molecules across multiple artificial membrane models while balancing predictive performance against interpretability. Using a dataset of 143 drugs and drug candidates with permeability measurements in six organ-specific PAMPA membranes, the authors build a multi-task quantitative structure–property relationship (QSPR) benchmark and systematically evaluate modeling approaches ranging from linear regression to a pre-trained Transformer architecture. The results indicate that, at this limited sample size, expert-designed physicochemical descriptors are better suited than deep learning–based representations. The study also offers insight into membrane-specific permeability profiles and points to an interpretable modeling approach for permeability prediction in drug development.
📝 Abstract
We present a unique multitask dataset comprising 143 drug and drug candidate molecules, each evaluated in vitro in parallel artificial membrane permeability assays (PAMPA) using six different model membranes. Using this resource, we systematically assess how well various molecular descriptors and regression models predict passive membrane permeability. The models studied range from simple linear regression to a modern pre-trained transformer architecture. Particular attention is given to the trade-off between predictive performance and model interpretability, highlighting the interpretability challenges that machine learning approaches introduce. To our knowledge, this is the most comprehensive study to date on the simultaneous modeling of multiple organ-specific PAMPA membranes, offering novel insights into membrane-specific permeability profiles.
We found that expert-designed physicochemical property descriptors are better suited to a permeability study with a limited sample size than deep learning–based representations.
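To make the multitask setup concrete, the following is a minimal sketch of the simplest kind of model the abstract mentions: a linear model that maps a small descriptor matrix to six permeability targets at once, scored by cross-validation. It is not the authors' pipeline. The data are synthetic placeholders with the same shape as the study (143 molecules, six membranes), the number of descriptors (8) is an arbitrary assumption, and ridge regularization and 5-fold splitting are illustrative choices rather than details taken from the paper.

```python
import numpy as np

def ridge_fit(X, Y, alpha=1.0):
    """Closed-form multi-output ridge regression: one weight column per membrane."""
    # Center the data so the intercept is fitted separately and not penalized.
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - x_mean, Y - y_mean
    W = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ Yc)
    b = y_mean - x_mean @ W
    return W, b

def cross_validated_r2(X, Y, n_folds=5, alpha=1.0, seed=0):
    """Per-task R^2 computed from out-of-fold predictions."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), n_folds)
    pred = np.empty_like(Y, dtype=float)
    for test_idx in folds:
        train = np.ones(len(X), dtype=bool)
        train[test_idx] = False
        # Standardize with training-fold statistics only, to avoid leakage.
        mu = X[train].mean(axis=0)
        sd = X[train].std(axis=0) + 1e-12
        W, b = ridge_fit((X[train] - mu) / sd, Y[train], alpha)
        pred[test_idx] = ((X[test_idx] - mu) / sd) @ W + b
    ss_res = ((Y - pred) ** 2).sum(axis=0)
    ss_tot = ((Y - Y.mean(axis=0)) ** 2).sum(axis=0)
    return 1.0 - ss_res / ss_tot

# Synthetic placeholder data (NOT the published measurements).
rng = np.random.default_rng(42)
n_molecules, n_descriptors, n_membranes = 143, 8, 6  # 8 descriptors is an assumption
X = rng.normal(size=(n_molecules, n_descriptors))
true_W = rng.normal(size=(n_descriptors, n_membranes))
Y = X @ true_W + 0.5 * rng.normal(size=(n_molecules, n_membranes))

r2 = cross_validated_r2(X, Y)   # one score per membrane
W, b = ridge_fit(X, Y)          # W[i, j]: weight of descriptor i for membrane j
print(np.round(r2, 3))
```

In a setup like this, each column of `W` can be read directly as the contribution of each descriptor to one membrane's predicted permeability, which is the kind of interpretability that is harder to obtain from learned deep representations.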