🤖 AI Summary
In population pharmacokinetic (PopPK) modeling, identifying covariates with nonlinear effects remains challenging because existing approaches are hard to interpret and depend on a predefined model. To address this, we propose a joint VAE-LASSO framework, evaluated on simulated tacrolimus PK profiles. First, a variational autoencoder (VAE) compresses high-dimensional PK time-series data into a structured latent space without assuming a pharmacological model. Second, LASSO regression maps patient covariates to this latent space and performs sparse covariate selection. This work is the first to integrate VAE representation learning with LASSO for data-driven covariate discovery in PopPK without requiring predefined pharmacological models. The method identifies four clinically relevant covariates (SNP, age, albumin, and hemoglobin) while discarding non-informative features. VAE reconstruction reaches a mean absolute percentage error (MAPE) of 2.26%, and LASSO selects the same covariates across the tested regularization strengths, indicating that the selection is robust and interpretable.
📝 Abstract
Population pharmacokinetic (PopPK) modelling is a fundamental tool for understanding drug behaviour across diverse patient populations and for enabling personalized dosing strategies that improve therapeutic outcomes. A key challenge in PopPK analysis lies in identifying and modelling covariates that influence drug absorption, as these relationships are often complex and nonlinear. Traditional methods may fail to capture hidden patterns within the data. In this study, we propose a data-driven, model-free framework that integrates a Variational Autoencoder (VAE), a deep learning model, with LASSO regression to uncover key covariates from simulated tacrolimus pharmacokinetic (PK) profiles. The VAE compresses high-dimensional PK signals into a structured latent space, achieving accurate reconstruction with a mean absolute percentage error (MAPE) of 2.26%. LASSO regression is then applied to map patient-specific covariates to the latent space, enabling sparse feature selection through L1 regularization. This approach consistently identifies clinically relevant covariates for tacrolimus, including SNP, age, albumin, and hemoglobin, which are retained across the tested regularization strengths, while effectively discarding non-informative features. The proposed VAE-LASSO methodology offers a scalable, interpretable, and fully data-driven solution for covariate selection, with promising applications in drug development and precision pharmacotherapy.
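The LASSO step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the VAE is omitted, and a single synthetic latent coordinate `z` stands in for a VAE latent dimension. The covariate values, the coefficients in `true_w`, the two `noise_*` columns, and the regularization grid are all assumptions chosen for illustration. The solver is a plain cyclic coordinate descent with soft-thresholding, written in standard-library Python so that it runs without dependencies.

```python
import random


def soft_threshold(rho, lam):
    """Soft-thresholding operator: shrink rho toward zero by lam."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimise (1/2n)*||y - Xw||^2 + lam*||w||_1 by cyclic coordinate descent.

    No intercept is fitted; the synthetic data below are roughly centred.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    resid = list(y)  # residual y - Xw, with w starting at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of column j with the residual that excludes j's own term.
            rho = sum(X[i][j] * (resid[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_w = soft_threshold(rho, lam) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                w[j] = new_w
    return w


# Synthetic, standardised covariates (assumption: not the paper's data).
rng = random.Random(0)
names = ["SNP", "age", "albumin", "hemoglobin", "noise_1", "noise_2"]
true_w = [1.0, -0.8, 0.6, 0.5, 0.0, 0.0]  # hypothetical effects on the latent code
n_patients = 200
X = [[rng.gauss(0.0, 1.0) for _ in names] for _ in range(n_patients)]

# Stand-in for one VAE latent coordinate per patient (assumption).
z = [sum(wj * xj for wj, xj in zip(true_w, row)) + rng.gauss(0.0, 0.1) for row in X]

# Fit at several regularization strengths and record which covariates survive.
selected_by_lambda = {}
for lam in (0.05, 0.1, 0.2):
    w = lasso_coordinate_descent(X, z, lam)
    selected_by_lambda[lam] = [name for name, wj in zip(names, w) if wj != 0.0]
    print(lam, selected_by_lambda[lam])
```

In practice a library solver such as scikit-learn's `Lasso` would replace the hand-written loop, and one model would be fitted per latent dimension. The point of the sketch is only that the L1 penalty sets the coefficients of uninformative covariates exactly to zero, so comparing the selected set across regularization strengths gives a simple stability check.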