Determining Window Sizes using Species Estimation for Accurate Process Mining over Streams

📅 2025-10-25

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

In streaming process mining, fixed-size sliding windows cannot keep up with evolving process behavior and concept drift, which biases the resulting models. To address this, we propose a dynamic window sizing method grounded in species estimation theory, introducing sample representativeness quantification into streaming process mining for the first time. Our approach assesses the representativeness of the current sliding window in real time and adapts the window size to balance timeliness against statistical sufficiency. It requires no prior knowledge and supports online detection of, and response to, concept drift. Experiments on multiple real-world event streams show that the method improves process model accuracy (average +12.7% F1-score) and robustness to concept drift (38.5% reduction in false positive rate) compared to static-window baselines. The result is an adaptive approach to real-time process analysis.

📝 Abstract

Streaming process mining deals with the real-time analysis of event streams. A common approach is to adopt windowing mechanisms that select event data from a stream for subsequent analysis. However, the size of these windows is a crucial parameter, as it influences the representativeness of the window content and, by extension, of the analysis results. Given that process dynamics are subject to change and potential concept drift, a static, fixed window size leads to inaccurate representations that introduce bias in the analysis. In this work, we present a novel approach for streaming process mining that addresses these limitations by adjusting window sizes. Specifically, we dynamically determine suitable window sizes based on estimators for the representativeness of samples as developed for species estimation in biodiversity research. Evaluation results on real-world data sets show improvements over existing approaches that adopt static window sizes in terms of accuracy and robustness to concept drift.
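To make the idea of a representativeness estimator concrete, here is a minimal sketch using the Chao1 richness estimator from biodiversity research. The abstract does not say which estimators the paper uses, so the choice of Chao1 is an assumption, as are the function name `chao1_completeness` and the treatment of directly-follows pairs such as `"a>b"` as "species". The completeness ratio (observed richness divided by estimated richness) indicates how much of the underlying behavior a window has probably captured.

```python
from collections import Counter


def chao1_completeness(observations):
    """Return (observed richness, Chao1 estimate, completeness ratio).

    `observations` is any iterable of hashable "species", for example the
    directly-follows pairs seen in the current window (an assumption made
    for illustration only).
    """
    counts = Counter(observations)
    s_obs = len(counts)
    if s_obs == 0:
        return 0, 0.0, 0.0

    f1 = sum(1 for c in counts.values() if c == 1)  # species seen exactly once
    f2 = sum(1 for c in counts.values() if c == 2)  # species seen exactly twice

    if f2 > 0:
        # Classic Chao1 lower bound on the total number of species.
        s_est = s_obs + (f1 * f1) / (2 * f2)
    else:
        # Bias-corrected form, used when there are no doubletons.
        s_est = s_obs + f1 * (f1 - 1) / 2

    return s_obs, s_est, s_obs / s_est


if __name__ == "__main__":
    window = ["a>b", "a>b", "b>c", "b>c", "c>d", "a>c"]
    print(chao1_completeness(window))  # (4, 5.0, 0.8)
```

A completeness close to 1 suggests the window already contains most of the behavior that is likely to occur; a low value suggests the window is too small to be representative.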

Problem

Research questions and friction points this paper is trying to address.

Dynamically determining window sizes for streaming process mining

Addressing concept drift and bias from static window parameters

Using species estimation to improve accuracy and robustness

Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamically adjusts window sizes for streaming process mining

Uses species estimation from biodiversity for sample representativeness

Improves accuracy and robustness against concept drift
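The dynamic adjustment described in the points above could be sketched roughly as follows. This is not the paper's algorithm: the class `AdaptiveWindow`, its parameters (`target`, `min_size`, `max_size`), and the use of Good-Turing sample coverage (one minus the share of singletons) as the representativeness measure are all assumptions chosen for illustration. The window drops its oldest event only when the remaining content would still meet the coverage target, so it stays small on stable streams and grows when new behavior appears.

```python
from collections import Counter, deque


def good_turing_coverage(counts, n):
    """Good-Turing sample coverage: 1 - (number of singletons / sample size)."""
    if n == 0:
        return 0.0
    f1 = sum(1 for c in counts.values() if c == 1)
    return 1.0 - f1 / n


class AdaptiveWindow:
    """Hypothetical sliding window whose size follows a coverage target."""

    def __init__(self, target=0.9, min_size=5, max_size=1000):
        self.target = target
        self.min_size = min_size
        self.max_size = max_size
        self.items = deque()
        self.counts = Counter()

    def push(self, item):
        """Add one event and return the resulting window size."""
        self.items.append(item)
        self.counts[item] += 1

        while len(self.items) > self.min_size:
            # Tentatively drop the oldest event.
            oldest = self.items.popleft()
            self.counts[oldest] -= 1
            if self.counts[oldest] == 0:
                del self.counts[oldest]

            over_limit = len(self.items) >= self.max_size
            covered = good_turing_coverage(self.counts, len(self.items)) >= self.target
            if over_limit or covered:
                continue  # keep the drop and try to shrink further

            # The smaller window would not be representative: undo and stop.
            self.items.appendleft(oldest)
            self.counts[oldest] += 1
            break

        return len(self.items)


if __name__ == "__main__":
    w = AdaptiveWindow()
    for _ in range(50):              # stable behavior: window stays small
        w.push("a")
        size = w.push("b")
    print("stable size:", size)
    for i in range(10):              # previously unseen behavior: window grows
        size = w.push(f"new-{i}")
    print("size after new behavior:", size)
```

The hard `max_size` cap is a design choice of this sketch that bounds memory when the stream keeps producing unseen items; whether the paper uses a comparable bound is not stated in the text above.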
