A Comparative Study of QSPR Methods on a Unique Multitask PAMPA dataset

📅 2026-05-01
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the prediction of passive permeability of drug molecules across multiple artificial membrane models, with attention to the balance between model performance and interpretability. Using a dataset of 143 drug and drug candidate molecules with permeability measurements in six organ-specific PAMPA membranes, the authors present what they describe as the most comprehensive study to date on simultaneous quantitative structure–property relationship (QSPR) modeling of multiple PAMPA membranes, systematically evaluating approaches that range from simple linear regression to a pretrained transformer architecture. The results indicate that, with a limited sample size, expert-designed physicochemical descriptors are better suited than deep learning–based representations. The study also offers insights into membrane-specific permeability profiles and examines the interpretability challenges introduced by machine learning approaches.
📝 Abstract
We present a unique multitask dataset comprising 143 drug and drug candidate molecules, each evaluated in in vitro parallel artificial membrane permeability assays (PAMPA) using six different model membranes. Using this resource, we systematically assess the effectiveness of various molecular descriptors and regression models in predicting passive membrane permeability. The studied models range from simple linear regression to a modern pre-trained transformer architecture. Particular attention is given to the trade-off between predictive performance and model interpretability, highlighting the challenges introduced by machine learning approaches. To our knowledge, this is the most comprehensive study on simultaneous modeling of multiple organ-specific PAMPA membranes to date, offering novel insights into membrane-specific permeability profiles. We found that expert-designed physicochemical property descriptors are better suited to a permeability study with a limited sample size than deep-learning-based representations.
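As a rough illustration of the kind of per-membrane evaluation the abstract describes, the sketch below fits a simple linear regression for each of six tasks and scores it with k-fold cross-validation. It is not the authors' pipeline: the data are synthetic stand-ins, the membrane labels (`membrane_1` to `membrane_6`) are hypothetical, a single descriptor replaces the full descriptor sets, and the fold count and noise level are arbitrary assumptions. Only the dataset size (143 compounds) and the number of membranes (six) come from the abstract.

```python
import random
from statistics import mean

# Hypothetical labels: the abstract states six membranes but does not name them here.
MEMBRANES = [f"membrane_{i}" for i in range(1, 7)]
N_COMPOUNDS = 143  # dataset size stated in the abstract


def make_synthetic_data(seed=0):
    """Synthetic stand-in data: one descriptor and six permeability-like targets."""
    rng = random.Random(seed)
    descriptor = [rng.uniform(-2.0, 6.0) for _ in range(N_COMPOUNDS)]
    targets = {}
    for k, name in enumerate(MEMBRANES):
        slope = 0.2 + 0.1 * k  # arbitrary task-specific slope (assumption)
        targets[name] = [slope * x - 6.0 + rng.gauss(0.0, 0.5) for x in descriptor]
    return descriptor, targets


def fit_line(x, y):
    """Ordinary least squares for y = a * x + b."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    a = sxy / sxx
    return a, my - a * mx


def cross_validated_r2(x, y, n_folds=5, seed=0):
    """Pooled out-of-fold R^2 from k-fold cross-validation."""
    idx = list(range(len(x)))
    random.Random(seed).shuffle(idx)
    preds = [0.0] * len(x)
    for f in range(n_folds):
        test = set(idx[f::n_folds])  # every n_folds-th shuffled index
        train = [i for i in idx if i not in test]
        a, b = fit_line([x[i] for i in train], [y[i] for i in train])
        for i in test:
            preds[i] = a * x[i] + b
    my = mean(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, preds))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def evaluate_all_tasks(descriptor, targets):
    """Score the same descriptor against every membrane task."""
    return {name: cross_validated_r2(descriptor, y) for name, y in targets.items()}


descriptor, targets = make_synthetic_data()
scores = evaluate_all_tasks(descriptor, targets)
for name, r2 in scores.items():
    print(f"{name}: out-of-fold R^2 = {r2:.2f}")
```

Scoring on held-out folds rather than on the training data matters at this sample size, since a flexible model can fit 143 points closely without generalizing. The same loop could in principle wrap any descriptor set or regressor, which is what makes a like-for-like comparison across model families possible.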
Problem

Research questions and friction points this paper is trying to address.

PAMPA
membrane permeability
QSPR
multitask learning
model interpretability
Innovation

Methods, ideas, or system contributions that make the work stand out.

multitask PAMPA dataset
membrane-specific permeability
molecular descriptors
model interpretability
QSPR
Authors

András Formanek
Department of Electrical Engineering (ESAT), STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics, KU Leuven, 3001 Leuven, Belgium
Anna Vincze
Department of Chemical and Environmental Process Engineering, Faculty of Chemical Technology and Biotechnology, Budapest University of Technology and Economics, Műegyetem rkp. 3, H-1111 Budapest, Hungary
Richárd Bicsak
Department of Chemical and Environmental Process Engineering, Faculty of Chemical Technology and Biotechnology, Budapest University of Technology and Economics, Műegyetem rkp. 3, H-1111 Budapest, Hungary
Yves Moreau
University of Leuven
Cheminformatics, Computational Biology, Bioinformatics, Human Genetics, Machine Learning
György T. Balogh
Department of Chemical and Environmental Process Engineering, Faculty of Chemical Technology and Biotechnology, Budapest University of Technology and Economics, Műegyetem rkp. 3, H-1111 Budapest, Hungary
Ádám Arany
University of Leuven
machine learning, chemoinformatics, time series, causality