SHAining on Process Mining: Explaining Event Log Characteristics Impact on Algorithms

📅 2025-09-10
🤖 AI Summary
Problem: Existing studies largely overlook the interdependencies among structural characteristics of event logs, which makes it hard to attribute process mining algorithm performance to individual characteristics. Method: We propose SHAining, a marginal contribution analysis framework that quantifies how much each key structural feature (e.g., noise ratio, behavioral variability, completeness) contributes to fitness, precision, and model complexity across over 22,000 real and synthetic logs. Rather than assuming isolated causal effects, SHAining measures associational effects among co-occurring characteristics, combining statistical modeling with interpretable analysis for process discovery. Contribution/Results: SHAining reveals the magnitude and nonlinear patterns of feature influence and identifies the most impactful features. It also enables robustness assessment of algorithms under varying structural conditions. The findings provide empirically grounded guidance for event log preprocessing, algorithm selection, and performance optimization, supporting both the theoretical understanding and the practical deployment of process mining techniques.

📝 Abstract
Process mining aims to extract and analyze insights from event logs, yet algorithm metric results vary widely depending on structural event log characteristics. Existing work often evaluates algorithms on a fixed set of real-world event logs but lacks a systematic analysis of how individual event log characteristics impact algorithms. Moreover, since event logs are generated from processes in which characteristics co-occur, we focus on associational rather than causal effects: we assess how strongly each of the overlapping characteristics affects evaluation metrics without assuming isolated causal effects, a factor often neglected by prior work. We introduce SHAining, the first approach to quantify the marginal contribution of varying event log characteristics to process mining algorithms' metrics. Using process discovery as a downstream task, we analyze over 22,000 event logs covering a wide span of characteristics to uncover which characteristics affect algorithms the most across metrics (e.g., fitness, precision, complexity). Furthermore, we offer novel insights into how the value of an event log characteristic correlates with its contributed impact, which allows us to assess an algorithm's robustness.
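To make "marginal contribution" concrete, here is a minimal sketch that assumes the contributions are Shapley values. The name SHAining hints at this, but the abstract does not state the exact computation. The characteristic names, the `toy_fitness` function, and all numbers are hypothetical and chosen only for illustration. They are not results from the paper.

```python
from itertools import combinations
from math import factorial


def shapley_values(features, value_fn):
    """Exact Shapley values: each feature's marginal contribution,
    averaged over all subsets of the remaining features."""
    n = len(features)
    result = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that excludes f.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (value_fn(s | {f}) - value_fn(s))
        result[f] = total
    return result


def toy_fitness(present):
    """Hypothetical metric: fitness of a discovered model given which
    log characteristics are 'active'. Values are invented."""
    score = 0.9
    if "noise_ratio" in present:
        score -= 0.2
    if "variability" in present:
        score -= 0.1
    # Co-occurring characteristics interact: the joint penalty is larger.
    if "noise_ratio" in present and "variability" in present:
        score -= 0.1
    return score


characteristics = ["noise_ratio", "variability", "completeness"]
phi = shapley_values(characteristics, toy_fitness)
for name, contribution in phi.items():
    print(f"{name}: {contribution:+.3f}")
```

In this toy setup the interaction penalty is split evenly between the two co-occurring characteristics, and the contributions sum to the difference between the metric with all characteristics active and with none. This shows why an attribution of this kind does not require assuming isolated effects.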
Problem

Research questions and friction points this paper is trying to address.

Analyzing how event log characteristics impact process mining algorithms
Quantifying marginal contributions of event log features to algorithm metrics
Assessing algorithm robustness across diverse event log characteristics
Innovation

Methods, ideas, or system contributions that make the work stand out.

SHAining quantifies event log characteristics' marginal contributions
Analyzes over 22,000 event logs to identify which characteristics most affect algorithm metrics
Assesses algorithm robustness by correlating characteristic values with their contributed impact
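
One way to read the robustness point above is as a rank correlation between a characteristic's value and its contribution to a metric. The sketch below computes a Spearman correlation in plain Python. The choice of Spearman is an assumption, since the source does not name the correlation measure, and the `noise_ratio` and `contribution` values are invented for illustration.

```python
from statistics import mean


def average_ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical data: one entry per event log.
noise_ratio = [0.0, 0.1, 0.2, 0.4, 0.8]            # characteristic value
contribution = [0.02, -0.01, -0.05, -0.12, -0.30]  # contribution to fitness

rho = spearman(noise_ratio, contribution)
print(f"Spearman rho = {rho:.2f}")
```

Under this reading, a correlation near zero would suggest the algorithm's metric is insensitive to that characteristic, while a strong correlation would suggest the metric degrades (or improves) steadily as the characteristic grows.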