AI Summary
Experimental data on diffusion coefficients in liquids remain scarce, so accurate and physically consistent prediction methods are needed. This work proposes ESE (Enhanced Stokes-Einstein), a hybrid model that integrates the Stokes-Einstein equation with machine learning and, unlike previous machine-learning approaches, yields strictly physically consistent predictions. Requiring only the SMILES strings of the components as additional input, ESE predicts infinite-dilution diffusion coefficients of solutes in pure solvents. The model is accurate across a wide temperature range and diverse chemical systems, and it significantly outperforms the previous state-of-the-art method, SEGWE, on a large-scale literature dataset. ESE is broadly applicable and its source code is openly available, which supports reproducibility and further research on molecular transport properties.
Abstract
Diffusion coefficients are key thermophysical properties for modeling mass transport in liquids, but experimental data are scarce, making reliable prediction methods indispensable. In the present work, we introduce a new method for predicting diffusion coefficients of molecular components at infinite dilution in pure liquid solvents by integrating the Stokes-Einstein (SE) equation with machine learning (ML). Unlike previous ML approaches, the resulting hybrid Enhanced Stokes-Einstein (ESE) model provides strictly physically consistent predictions for diffusion coefficients as a function of temperature across a broad range of binary mixtures. Trained and validated on an extensive compilation of literature data for infinite-dilution diffusion coefficients in binary liquid systems, ESE achieves significantly higher prediction accuracy than the previous state-of-the-art model, SEGWE, while requiring only the SMILES strings of the components of interest as additional inputs, which are always available. This simplicity makes ESE broadly applicable, e.g., for process design and optimization. The ESE model and its source code are fully disclosed and are directly accessible via an interactive web interface at https://ml-prop.mv.rptu.de/.
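For orientation, the classical Stokes-Einstein relation on which the hybrid model builds can be written as D = k_B T / (6 π η r), where T is the temperature, η the solvent viscosity, and r the hydrodynamic radius of the solute. The following minimal Python sketch evaluates only this textbook relation. It is not the ESE model, and the abstract does not state how ESE combines the relation with ML. The function name `stokes_einstein` and the numerical inputs (a water-like viscosity and a 0.2 nm radius) are illustrative assumptions.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def stokes_einstein(temperature_k, viscosity_pa_s, radius_m):
    """Classical Stokes-Einstein estimate of a diffusion coefficient in m^2/s.

    temperature_k  : absolute temperature in K
    viscosity_pa_s : dynamic viscosity of the pure solvent in Pa s
    radius_m       : hydrodynamic radius of the solute in m (assumed input)
    """
    if temperature_k <= 0 or viscosity_pa_s <= 0 or radius_m <= 0:
        raise ValueError("temperature, viscosity, and radius must be positive")
    return K_B * temperature_k / (6.0 * math.pi * viscosity_pa_s * radius_m)


# Hypothetical example: water-like solvent at 298.15 K (eta = 0.89 mPa s)
# and an assumed solute radius of 0.2 nm.
d_example = stokes_einstein(298.15, 0.89e-3, 2.0e-10)
print(f"D = {d_example:.2e} m^2/s")
```

With these assumed inputs the estimate is on the order of 1e-9 m^2/s, a typical magnitude for small molecules in low-viscosity liquids. At fixed viscosity and radius the relation scales linearly with temperature; in real solvents the viscosity itself changes strongly with temperature, so it must be supplied at each temperature of interest.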