All Eyes on the Workflow: Automated and Efficient Event Discovery from Video Streams

📅 2026-04-24
📈 Citations: 0
Influential: 0

🤖 AI Summary
This study addresses the problem of generating structured event logs from multimodal data such as video, so that the data can be used for business process mining. The authors propose an end-to-end approach, SnapLog, that maps video frames to feature vectors using image embeddings and then segments the video in time using an inter-frame similarity matrix. A generalized few-shot classification method then assigns a semantic label to each segment, yielding a timestamped, structured event sequence. The work is described as the first to combine image embeddings with few-shot learning to turn raw video into process-mining-ready event logs, removing the usual reliance on pre-structured input data. The authors report that the resulting logs accurately reflect the processes shown in the videos.
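The segmentation step can be illustrated with a minimal sketch. It is not the authors' implementation: the boundary rule (cut wherever the cosine similarity between adjacent frames drops below a fixed threshold), the function names, the threshold value, and the toy 2-D embeddings are all assumptions made for illustration. In practice the embeddings would come from a pretrained image model.

```python
import numpy as np

def cosine_similarity_matrix(embeddings):
    """Return the frame-by-frame cosine similarity matrix."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.clip(norms, 1e-12, None)  # avoid division by zero
    return unit @ unit.T

def segment_frames(embeddings, threshold=0.8):
    """Split frames into contiguous (start, end) segments, end exclusive.

    Assumed rule (not taken from the paper): a boundary is placed wherever
    the similarity between two adjacent frames falls below `threshold`.
    """
    sim = cosine_similarity_matrix(embeddings)
    n = sim.shape[0]
    adjacent = np.diagonal(sim, offset=1)  # similarity of frame i and i + 1
    boundaries = [i + 1 for i in range(n - 1) if adjacent[i] < threshold]
    starts = [0] + boundaries
    ends = boundaries + [n]
    return list(zip(starts, ends))

# Toy embeddings: two frames of one scene, then three frames of another.
frames = np.array([
    [1.0, 0.0],
    [0.9, 0.1],
    [0.0, 1.0],
    [0.1, 0.9],
    [0.05, 0.95],
])
segments = segment_frames(frames)
print(segments)
```

Only the first off-diagonal of the matrix is used here. A method that works on the full matrix could also exploit longer-range similarity, for example to recognise a scene that recurs later in the video.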

📝 Abstract
Disciplines such as business process management and process mining aid organizations by discovering insights about processes on the basis of recorded event data. However, an obstacle to process analysis is data multi-modality: for instance, data in video form are not directly interpretable as events. In this work, we present SnapLog, an approach to extract event data from videos by converting frames to feature vectors using image embeddings and performing temporal segmentation through frame-wise similarity matrices. A generalized few-shot classification is then used to assign labels to the video segments, yielding labeled, timestamped sub-sequences of frames that are interpretable as events. Conventional process mining techniques can be used to analyze the resulting data. We show that our approach produces logs that accurately reflect the process in the videos.
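The labeling step can be sketched in a similar way. The abstract does not specify the generalized few-shot classifier, so this example substitutes a simple nearest-prototype rule: each activity label is represented by the mean of a few labeled example embeddings, and each segment takes the label of its most similar prototype. The activity names, the `fps` parameter, and the event dictionary layout are hypothetical.

```python
import numpy as np

def build_prototypes(support):
    """Average the few labeled example embeddings for each activity label."""
    return {label: np.mean(np.asarray(vecs, dtype=float), axis=0)
            for label, vecs in support.items()}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def label_segments(embeddings, segments, prototypes, fps=1.0):
    """Turn (start, end) frame segments into labeled, timestamped events."""
    events = []
    for start, end in segments:
        mean_vec = embeddings[start:end].mean(axis=0)  # one vector per segment
        label = max(prototypes,
                    key=lambda name: cosine(mean_vec, prototypes[name]))
        events.append({
            "activity": label,
            "start_s": start / fps,  # frame index converted to seconds
            "end_s": end / fps,
        })
    return events

# Toy data: five frame embeddings already split into two segments.
frames = np.array([
    [1.0, 0.0],
    [0.9, 0.1],
    [0.0, 1.0],
    [0.1, 0.9],
    [0.05, 0.95],
])
segments = [(0, 2), (2, 5)]

# A few labeled examples per activity (hypothetical labels).
support = {
    "pick part": [[1.0, 0.0], [0.95, 0.05]],
    "assemble": [[0.0, 1.0]],
}

events = label_segments(frames, segments, build_prototypes(support))
for event in events:
    print(event)
```

Each resulting dictionary has an activity name and start and end timestamps. Adding a case identifier (for example, one per video) would give the minimal shape of an event log that conventional process mining tools expect.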

Problem

Research questions and friction points this paper is trying to address.

event discovery
video streams
process mining
data multi-modality
event data extraction

Innovation

Methods, ideas, or system contributions that make the work stand out.

video-to-event extraction
image embeddings
temporal segmentation
few-shot classification
process mining