Determining Window Sizes using Species Estimation for Accurate Process Mining over Streams

📅 2025-10-25
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
In streaming process mining, fixed-size sliding windows struggle to accommodate dynamic process evolution and concept drift, leading to model bias. To address this, we propose a dynamic window optimization method grounded in species estimation theory—introducing, for the first time, sample representativeness quantification into streaming process mining. Our approach establishes a real-time representativeness assessment model under sliding windows and adaptively adjusts window size to balance timeliness and statistical sufficiency. It requires no prior knowledge and enables online detection and response to concept drift. Experiments on multiple real-world event streams demonstrate that our method significantly improves process model accuracy (average +12.7% F1-score) and robustness to concept drift (38.5% reduction in false positive rate) compared to static-window baselines. This work establishes a novel paradigm for real-time, adaptive process analysis.

Technology Category

Application Category

📝 Abstract
Streaming process mining deals with the real-time analysis of event streams. A common approach for it is to adopt windowing mechanisms that select event data from a stream for subsequent analysis. However, the size of these windows denotes a crucial parameter, as it influences the representativeness of the window content and, by extension, of the analysis results. Given that process dynamics are subject to changes and potential concept drift, a static, fixed window size leads to inaccurate representations that introduce bias in the analysis. In this work, we present a novel approach for streaming process mining that addresses these limitations by adjusting window sizes. Specifically, we dynamically determine suitable window sizes based on estimators for the representativeness of samples as developed for species estimation in biodiversity research. Evaluation results on real-world data sets show improvements over existing approaches that adopt static window sizes in terms of accuracy and robustness to concept drifts.
Problem

Research questions and friction points this paper is trying to address.

Dynamically determining window sizes for streaming process mining
Addressing concept drift and bias from static window parameters
Using species estimation to improve accuracy and robustness
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamically adjusts window sizes for streaming process mining
Uses species estimation from biodiversity for sample representativeness
Improves accuracy and robustness against concept drifts
🔎 Similar Papers
No similar papers found.
C
Christian Imenkamp
Business Informatics and Process Analytics, University of Bayreuth, Germany
Martin Kabierski
Martin Kabierski
PhD Student, Humboldt-Unviersität zu Berlin
Process MiningBusiness Process Management
H
Hendrik Reiter
Department of Computer Science, Christian-Albrechts-Universität zu Kiel
M
Matthias Weidlich
Department of Computer Science, Humboldt-Universität zu Berlin, Germany
Wilhelm Hasselbring
Wilhelm Hasselbring
Professor of Software Engineering, University of Kiel
Software Engineering
Agnes Koschmider
Agnes Koschmider
University of Bayreuth
Business Process ManagementEvent Log AnalysisPrivacyBehavior Modeling