🤖 AI Summary
Existing predictive process mining (PPM) research faces several challenges: poor model reproducibility, inconsistent evaluation protocols, difficulty integrating new data, and a lack of standardized cross-method comparison. To address these issues, we introduce SPICE, the first open-source PyTorch-based benchmarking framework designed specifically for PPM. SPICE implements three dominant deep learning architectures (LSTM, GRU, and Transformer) on a shared foundation of standardized data preprocessing, configurable training pipelines, and unified evaluation interfaces. Its modular design supports reproducible, extensible modeling for both key performance indicator (KPI) prediction and behavioral process modeling. We reproduce experiments across 11 real-world process event logs and compare the results against the originally reported metrics, enabling fairer and more transparent cross-architecture comparisons. SPICE aims to advance PPM research by improving reproducibility, transparency, and methodological comparability.
📝 Abstract
In recent years, Predictive Process Mining (PPM) techniques based on artificial neural networks have emerged as a method for monitoring the future behavior of unfolding business processes and predicting Key Performance Indicators (KPIs). However, many PPM approaches lack reproducibility, transparency in decision making, and usability for incorporating novel datasets and benchmarking, which makes comparisons among different implementations very difficult. In this paper, we propose SPICE, a Python framework that reimplements three popular existing deep-learning-based baseline methods for PPM in PyTorch and provides a common base framework with rigorous configurability to enable reproducible and robust comparison of past and future modelling approaches. We compare SPICE against the originally reported metrics and against fair metrics on 11 datasets.