🤖 AI Summary
Missing attributes in event logs severely hinder process mining; existing repair methods either rely on a prior process model or recover only a subset of event attributes. This paper proposes an end-to-end event log repair framework based on heterogeneous graph neural networks (HGNNs): it models execution traces as heterogeneous graphs that integrate multimodal nodes and relations (including activities, resources, and timestamps) and jointly reconstructs missing attributes via message passing. The approach requires no process model and targets all event attributes rather than a single attribute type. The method is evaluated against a state-of-the-art autoencoder-based approach on two synthetic and four real-world event logs, across different types of missing values, and shows very good performance in reconstructing all event attributes.
📝 Abstract
The quality of event logs in Process Mining is crucial to any analysis applied to them. In real-world event logs, data acquisition can be non-trivial (e.g., because activities are executed and recorded manually, or because not all attributes can be collected for each event), and events are often recorded with some information missing. Standard approaches to the problem of trace (or log) reconstruction either require a process model, which is used to fill in missing values through different reasoning techniques, or employ a Machine Learning/Deep Learning model that restores the missing values by learning from similar cases. In recent years, a new type of Deep Learning model capable of handling input data encoded as graphs has emerged, namely Graph Neural Networks. Graph Neural Network models, and even more so Heterogeneous Graph Neural Networks, work with a more natural representation of complex multi-modal sequences such as the execution traces in Process Mining, allowing for more expressive and semantically rich encodings.
In this work, we develop a Heterogeneous Graph Neural Network model that, given a trace containing some incomplete events, returns the full set of attributes missing from those events. We evaluate our work against a state-of-the-art approach leveraging autoencoders on two synthetic logs and four real event logs, across different types of missing values. Unlike state-of-the-art model-free approaches, which mainly focus on repairing a subset of event attributes, the proposed approach performs very well in reconstructing all event attributes.
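To make the idea of encoding a trace as a heterogeneous graph more concrete, the following is a minimal sketch in plain Python. It is not the encoding used in the paper: the node types (`event`, `activity`, `resource`, `timestamp`), the relation names (`next`, `has_<attribute>`), the `MISSING` marker, and the function `trace_to_hetero_graph` are all assumptions made for illustration. The sketch only builds the graph and lists the attributes that a repair model would have to reconstruct; it does not implement the neural network itself.

```python
from collections import defaultdict

MISSING = None  # hypothetical marker for an attribute that was not recorded


def trace_to_hetero_graph(trace):
    """Encode one trace as typed nodes and typed edges (illustrative only).

    Node types: 'event', 'activity', 'resource', 'timestamp'.
    Edge types: ('event', 'next', 'event') links consecutive events;
                ('event', 'has_<attr>', '<attr>') links an event to a value.
    """
    nodes = defaultdict(list)  # node type -> list of node values
    edges = defaultdict(list)  # (src type, relation, dst type) -> [(src, dst)]
    masked = []                # (event index, attribute) pairs to reconstruct

    for i, event in enumerate(trace):
        nodes["event"].append(i)
        if i > 0:
            edges[("event", "next", "event")].append((i - 1, i))
        for attr in ("activity", "resource", "timestamp"):
            value = event.get(attr, MISSING)
            if value is MISSING:
                # No edge is created; this slot is a reconstruction target.
                masked.append((i, attr))
                continue
            if value not in nodes[attr]:
                nodes[attr].append(value)
            j = nodes[attr].index(value)
            edges[("event", f"has_{attr}", attr)].append((i, j))
    return dict(nodes), dict(edges), masked


# Hypothetical three-event trace with two missing attributes.
trace = [
    {"activity": "Register", "resource": "Alice", "timestamp": 0},
    {"activity": None, "resource": "Bob", "timestamp": 5},
    {"activity": "Approve", "resource": "Alice"},
]
nodes, edges, masked = trace_to_hetero_graph(trace)
print(masked)  # [(1, 'activity'), (2, 'timestamp')]
```

In this representation, events that share a resource or an activity become connected through a common attribute node, which is the kind of structure a message-passing model could exploit when inferring the masked slots.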