🤖 AI Summary
Problem: Evaluating prescriptive process monitoring (PresPM) methods is hampered by the absence of counterfactual outcomes in event logs. The commonly used RealCause approach relies on a single TARNet architecture and ignores the temporal dependencies in process data. Method: This paper proposes ProCause, a counterfactual generation framework that integrates temporal modeling with multiple causal inference architectures. It supports sequential models such as LSTMs to capture temporal dependencies in process data, and it combines S-Learner, T-Learner, TARNet, and an ensemble of these in a deep generative approach. Contribution/Results: Experiments with a simulator that provides known ground truths show that TARNet is not always the best choice, that an ensemble offers more consistent reliability, and that LSTMs show potential for improved evaluations when temporal dependencies are present. An analysis of real-world data further supports the method's practical effectiveness. ProCause thus provides a more reliable, temporally aware basis for evaluating PresPM techniques, addressing key limitations of current causal-inference-based evaluation.
📝 Abstract
Prescriptive Process Monitoring (PresPM) is the subfield of Process Mining that focuses on optimizing processes through real-time interventions based on event log data. Evaluating PresPM methods is challenging due to the lack of ground-truth outcomes for all intervention actions in datasets. A generative deep learning approach from the field of Causal Inference (CI), RealCause, has been commonly used to estimate the outcomes for proposed intervention actions to evaluate a new policy. However, RealCause overlooks the temporal dependencies in process data, and relies on a single CI model architecture, TARNet, limiting its effectiveness. To address both shortcomings, we introduce ProCause, a generative approach that supports both sequential (e.g., LSTMs) and non-sequential models while integrating multiple CI architectures (S-Learner, T-Learner, TARNet, and an ensemble). Our research using a simulator with known ground truths reveals that TARNet is not always the best choice; instead, an ensemble of models offers more consistent reliability, and leveraging LSTMs shows potential for improved evaluations when temporal dependencies are present. We further validate ProCause's practical effectiveness through a real-world data analysis, ensuring a more reliable evaluation of PresPM methods.
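To make the difference between the named architectures concrete, the following is a minimal sketch of how an S-Learner, a T-Learner, and a simple ensemble each estimate the outcome under both intervention options. It is not ProCause's implementation: it assumes plain least-squares regressors in place of deep generative models, non-sequential features (no LSTM), prediction averaging as the ensembling rule, and a toy synthetic dataset with a known treatment effect standing in for a simulator with ground truth. All function and variable names are hypothetical.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares fit with an intercept; returns the weight vector."""
    A = np.column_stack([X, np.ones(len(X))])
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w

def predict_linear(w, X):
    """Apply weights from fit_linear to new rows."""
    A = np.column_stack([X, np.ones(len(X))])
    return A @ w

def s_learner(X, t, y):
    """S-Learner: a single model that takes the treatment as an extra feature."""
    w = fit_linear(np.column_stack([X, t]), y)
    def predict(X_new, t_value):
        t_col = np.full(len(X_new), t_value, dtype=float)
        return predict_linear(w, np.column_stack([X_new, t_col]))
    return predict

def t_learner(X, t, y):
    """T-Learner: two separate models, one per treatment arm."""
    w0 = fit_linear(X[t == 0], y[t == 0])
    w1 = fit_linear(X[t == 1], y[t == 1])
    def predict(X_new, t_value):
        return predict_linear(w1 if t_value == 1 else w0, X_new)
    return predict

def ensemble(predictors):
    """Assumed ensembling rule: average the members' predictions."""
    def predict(X_new, t_value):
        return np.mean([p(X_new, t_value) for p in predictors], axis=0)
    return predict

# Toy data with a known ground-truth effect (illustrative values only).
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 3))
t = rng.integers(0, 2, size=n)
true_effect = 2.0
y = X @ np.array([1.0, -0.5, 0.3]) + true_effect * t + rng.normal(scale=0.1, size=n)

models = {"s": s_learner(X, t, y), "t": t_learner(X, t, y)}
models["ensemble"] = ensemble([models["s"], models["t"]])

# Each model predicts both potential outcomes, including the unobserved one.
estimated_effects = {
    name: float(np.mean(m(X, 1) - m(X, 0))) for name, m in models.items()
}
```

Because the true effect is known in this toy setting, each entry of `estimated_effects` can be compared directly against `true_effect`, which mirrors in miniature how a simulator with ground truth allows competing architectures to be ranked.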