End-to-End Reinforcement Learning of Koopman Models for eNMPC of an Air Separation Unit

📅 2025-11-06
🤖 AI Summary
For large-scale single-product air separation units (ASUs), demand response control is challenging: only a few plant variables can realistically be measured, constraints are easily violated, and economic performance is hard to guarantee. To address this, the paper trains Koopman surrogate models end-to-end with reinforcement learning for use in economic nonlinear model predictive control (eNMPC). The approach builds on the authors' earlier method (Mayfrank et al., 2024), which had so far been demonstrated only on a small-scale case study. Evaluated on a large-scale model of a single-product (nitrogen) ASU, the method avoids the frequent constraint violations of a purely system identification-based Koopman eNMPC while delivering similar economic performance. The results indicate that the method scales well to large, complex process systems.

📝 Abstract
With our recently proposed method based on reinforcement learning (Mayfrank et al. (2024), Comput. Chem. Eng. 190), Koopman surrogate models can be trained for optimal performance in specific (economic) nonlinear model predictive control ((e)NMPC) applications. So far, our method has exclusively been demonstrated on a small-scale case study. Herein, we show that our method scales well to a more challenging demand response case study built on a large-scale model of a single-product (nitrogen) air separation unit. Across all numerical experiments, we assume observability of only a few realistically measurable plant variables. Compared to a purely system identification-based Koopman eNMPC, which generates small economic savings but frequently violates constraints, our method delivers similar economic performance while avoiding constraint violations.
Problem

Research questions and friction points this paper is trying to address.

Trains Koopman surrogate models with reinforcement learning for economic nonlinear model predictive control (eNMPC)
Scales methodology to large-scale air separation unit demand response optimization
Improves constraint satisfaction over a purely system identification-based Koopman eNMPC while delivering similar economic performance
Innovation

Methods, ideas, or system contributions that make the work stand out.

End-to-end reinforcement learning for Koopman models
Training surrogates for economic nonlinear model predictive control
Scalable method avoiding constraint violations in control
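
As a rough illustration of what a Koopman surrogate model looks like inside a predictive controller, the sketch below lifts a measured state into a higher-dimensional space, propagates it with linear dynamics, and maps it back to plant outputs. This is a generic structure, not the architecture used in the paper: the function names `lift` and `rollout`, the hand-picked observables, and the matrices `A`, `B`, `C` are all illustrative assumptions. In a learned surrogate these quantities would be trainable parameters, and the reinforcement learning step that tunes them is not shown.

```python
import numpy as np

def lift(x):
    # Hypothetical lifting function: appends squared states as extra observables.
    # In a learned Koopman model this would typically be a trained encoder.
    return np.concatenate([x, x**2])

def rollout(x0, controls, A, B, C):
    # Predict outputs over a horizon using linear dynamics in the lifted space:
    #   z_{k+1} = A z_k + B u_k,   y_k = C z_k
    z = lift(x0)
    outputs = []
    for u in controls:
        z = A @ z + B @ u
        outputs.append(C @ z)
    return np.array(outputs)

# Placeholder dimensions and matrices (assumed values, for illustration only).
n_x, n_u = 2, 1
n_z = 2 * n_x
A = 0.9 * np.eye(n_z)                                # lifted-state transition
B = 0.1 * np.ones((n_z, n_u))                        # control input map
C = np.hstack([np.eye(n_x), np.zeros((n_x, n_x))])   # decoder back to states

x0 = np.array([1.0, -0.5])
controls = np.zeros((3, n_u))                        # three steps, zero input
predictions = rollout(x0, controls, A, B, C)
print(predictions)
```

Because the lifted dynamics are linear, the prediction step inside the controller stays cheap even when the underlying plant is strongly nonlinear, which is one common motivation for using Koopman surrogates in (e)NMPC.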
Daniel Mayfrank
Doctoral student, Forschungszentrum Jülich GmbH, Institute of Energy and Climate Research
Optimal control, Machine learning
Kayra Dernek
RWTH Aachen University, Process Systems Engineering (AVT.SVT), Aachen 52074, Germany
Laura Lang
RWTH Aachen University, Process Systems Engineering (AVT.SVT), Aachen 52074, Germany
Alexander Mitsos
AVT Systemverfahrenstechnik, RWTH Aachen University and Energy Systems Engineering IEK-10
process systems engineering, energy systems, global optimization, bilevel optimization, process
M. Dahmen
Forschungszentrum Jülich GmbH, Institute of Climate and Energy Systems, Energy Systems Engineering (ICE-1), Jülich 52425, Germany