🤖 AI Summary
This study addresses the need for digital-twin-style decision-support tools for wastewater treatment plants that can handle irregularly sampled, missing, heavy-tailed, and zero-inflated sensor data while remaining informative over 12–36 hour planning horizons. To this end, the authors propose CCSS-RS, a data-driven open-loop simulator that separates historical state inference from the future rollout of control and exogenous inputs. The approach integrates typed context encoding, gain-weighted forcing of prescribed and forecast drivers, semigroup-consistent rollouts, and Student-t plus hurdle outputs within a controlled continuous-time state-space formulation inspired by neural differential equations, yielding context-aware probabilistic forecasts. On the public Avedøre full-scale benchmark, the model achieves an RMSE of 0.696 and a CRPS of 0.349 at H=1000, a 40–46% RMSE reduction relative to Neural CDE baselines. Four case studies on test data (oxygen-setpoint perturbations, setpoint-plan screening, context-only sensor outages, and comparison against persistence) illustrate its use for offline scenario screening.
📝 Abstract
Wastewater treatment plants (WWTPs) need digital-twin-style decision support tools that can simulate plant response under prescribed control plans, tolerate irregular and missing sensing, and remain informative over 12-36 h planning horizons. Meeting these requirements with full-scale plant data remains an open engineering-AI challenge. We present CCSS-RS, a controlled continuous-time state-space model that separates historical state inference from future control and exogenous rollout. The model combines typed context encoding, gain-weighted forcing of prescribed and forecast drivers, semigroup-consistent rollouts, and Student-t plus hurdle outputs for heavy-tailed and zero-inflated WWTP sensor data. On the public Avedøre full-scale benchmark, with 906,815 timesteps, 43% missingness, and 1-20 min irregular sampling, CCSS-RS achieves RMSE 0.696 and CRPS 0.349 at H=1000 across 10,000 test windows. This reduces RMSE by 40-46% relative to Neural CDE baselines and by 31-35% relative to simplified internal variants. Four case studies using a frozen checkpoint on test data demonstrate operational value: oxygen-setpoint perturbations shift predicted ammonium by -2.3 to +1.4 over horizons 300-1000; a smoothed setpoint plan ranks first in multi-criterion screening; context-only sensor outages raise monitored-variable RMSE by at most 10%; and ammonium, nitrate, and oxygen remain more accurate than persistence throughout the rollout. These results establish CCSS-RS as a practical learned simulator for offline scenario screening in industrial wastewater treatment, complementary to mechanistic models.
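To make the "Student-t plus hurdle" output concrete, the following is a minimal sketch of how such a likelihood could be scored for a single sensor channel. It is not taken from the paper: the function names, the parameterization (`p_zero`, `loc`, `scale`, `nu`), the use of an untruncated Student-t for nonzero readings, and the `None` convention for missing readings are all assumptions made for illustration.

```python
import math


def student_t_logpdf(y, loc, scale, nu):
    """Log-density of a location-scale Student-t with nu degrees of freedom."""
    z = (y - loc) / scale
    return (
        math.lgamma((nu + 1.0) / 2.0)
        - math.lgamma(nu / 2.0)
        - 0.5 * math.log(nu * math.pi)
        - math.log(scale)
        - (nu + 1.0) / 2.0 * math.log1p(z * z / nu)
    )


def hurdle_student_t_nll(y, p_zero, loc, scale, nu):
    """Negative log-likelihood of one reading under a hurdle model.

    Assumption (not from the paper): exact zeros come from a point mass
    with probability p_zero; nonzero readings come from an untruncated
    Student-t weighted by (1 - p_zero).
    """
    if y == 0.0:
        return -math.log(p_zero)
    return -(math.log1p(-p_zero) + student_t_logpdf(y, loc, scale, nu))


def masked_mean_nll(values, params):
    """Average NLL over observed readings; None marks a missing reading."""
    terms = [
        hurdle_student_t_nll(y, *p)
        for y, p in zip(values, params)
        if y is not None
    ]
    return sum(terms) / len(terms) if terms else 0.0
```

In this sketch the zero mass absorbs zero-inflated readings, the Student-t tail limits the influence of outliers relative to a Gaussian, and skipping `None` entries keeps missing sensor values out of the loss. How CCSS-RS actually parameterizes and trains these outputs is specified in the paper, not here.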