AI Summary
Chemical engineering faces severe data silos: proprietary process and molecular data held by enterprises cannot be shared, which limits the generalizability of multiscale machine learning models. To address this, this work applies federated learning to multiscale modeling in chemical engineering, from the molecular to the process scale. Two case studies pair federated training with graph neural networks (GNNs) and autoencoders: GNNs predict binary mixture activity coefficients from distributed data sets, while autoencoders perform system identification of a distillation column. Participating parties train locally and exchange only encrypted model parameters; raw data never leave their premises. Experiments show that the federated models achieve significantly higher accuracy than single-site training and perform similarly to centralized training on the full aggregated data. These results support the feasibility of privacy-preserving, cross-enterprise collaborative modeling in chemical engineering.
Abstract
We present a perspective on federated learning in chemical engineering that envisions collaborative efforts in machine learning (ML) developments within the chemical industry. Large amounts of chemical and process data are proprietary to chemical companies and are therefore locked in data silos, hindering the training of ML models on large data sets in chemical engineering. Recently, the concept of federated learning has gained increasing attention in ML research, enabling organizations to jointly train ML models without disclosing their individual data. We discuss potential applications of federated learning in several fields of chemical engineering, from the molecular to the process scale. In addition, we apply federated learning in two exemplary case studies that simulate practical scenarios of multiple chemical companies holding proprietary data sets: (i) prediction of binary mixture activity coefficients with graph neural networks and (ii) system identification of a distillation column with autoencoders. Our results indicate that ML models jointly trained with federated learning yield significantly higher accuracy than models trained by each chemical company individually and can perform similarly to models trained on the combined data sets from all companies. Federated learning therefore has great potential to advance ML models in chemical engineering while respecting corporate data privacy, making it promising for future industrial applications.
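The idea of joint training without data disclosure can be illustrated with a minimal sketch of federated averaging (FedAvg), a common federated learning scheme. This is an assumption for illustration only: the abstract does not state which aggregation method the case studies use, and a linear model on synthetic data stands in for the graph neural networks and autoencoders. All function names (`local_update`, `federated_average`, `run_fedavg`, `make_clients`) and hyperparameters are hypothetical.

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    """Run a few gradient-descent steps on one client's private data."""
    w = weights.copy()
    for _ in range(epochs):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)  # gradient of the mean squared error
        w -= lr * grad
    return w

def federated_average(client_weights, client_sizes):
    """Aggregate client models, weighting each by its number of samples."""
    sizes = np.asarray(client_sizes, dtype=float)
    stacked = np.stack(client_weights)
    return (stacked * sizes[:, None]).sum(axis=0) / sizes.sum()

def run_fedavg(clients, n_features, rounds=50):
    """clients: list of (X, y) pairs; only model weights are exchanged."""
    global_w = np.zeros(n_features)
    for _ in range(rounds):
        # Each client starts from the current global model and trains locally.
        updates = [local_update(global_w, X, y) for X, y in clients]
        # The server sees the updated weights, never the raw (X, y) data.
        global_w = federated_average(updates, [len(y) for _, y in clients])
    return global_w

def make_clients(true_w, sizes, seed=0):
    """Synthetic stand-in for several companies holding private data sets."""
    rng = np.random.default_rng(seed)
    clients = []
    for n in sizes:
        X = rng.normal(size=(n, len(true_w)))
        y = X @ true_w + 0.01 * rng.normal(size=n)
        clients.append((X, y))
    return clients

true_w = np.array([1.5, -2.0, 0.5])
clients = make_clients(true_w, sizes=[40, 60, 100])
w_fed = run_fedavg(clients, n_features=3)
```

In this sketch the server only ever receives weight vectors. A real deployment would add a model suited to the task and, where required, secure aggregation or encryption of the exchanged parameters.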