FlowExtract: Procedural Knowledge Extraction from Maintenance Flowcharts

📅 2026-04-08
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the challenge of unlocking procedural knowledge embedded in static flowcharts that follow the ISO 5807 standard, commonly stored as PDFs or scanned images in manufacturing facilities, for use by modern intelligent systems. To this end, the authors propose FlowExtract, a structured knowledge extraction pipeline that decouples node detection from edge reconstruction. The approach uses YOLOv8 for node detection and EasyOCR for text recognition, and combines arrowhead orientation analysis with backward connection tracing to recover edge relationships. On a dataset of industrial troubleshooting guides, FlowExtract achieves near-perfect node detection accuracy and significantly outperforms vision-language model baselines on edge extraction. The work offers a practical path toward queryable, structured representations of procedural knowledge from industrial flowcharts.
📝 Abstract
Maintenance procedures in manufacturing facilities are often documented as flowcharts in static PDFs or scanned images. They encode procedural knowledge essential for asset lifecycle management, yet inaccessible to modern operator support systems. Vision-language models, the dominant paradigm for image understanding, struggle to reconstruct connection topology from such diagrams. We present FlowExtract, a pipeline for extracting directed graphs from ISO 5807-standardized flowcharts. The system separates element detection from connectivity reconstruction, using YOLOv8 and EasyOCR for standard domain-aligned node detection and text extraction, combined with a novel edge detection method that analyzes arrowhead orientations and traces connecting lines backward to source nodes. Evaluated on industrial troubleshooting guides, FlowExtract achieves very high node detection and substantially outperforms vision-language model baselines on edge extraction, offering organizations a practical path toward queryable procedural knowledge representations. The implementation is available at https://github.com/guille-gil/FlowExtract.
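To make the idea of tracing a connector backward from its arrowhead more concrete, here is a minimal Python sketch. It is not the authors' implementation: the `Node` class, the `trace_edge` and `build_graph` helpers, the pixel-set representation of connector lines, and the breadth-first search are all assumptions chosen for illustration. In a real pipeline the node boxes and labels would come from a detector and an OCR step (such as YOLOv8 and EasyOCR, as the abstract describes); here they are hard-coded.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical detected node: OCR label plus bounding box in pixels."""
    label: str
    x0: int
    y0: int
    x1: int
    y1: int

    def contains(self, x, y):
        return self.x0 <= x <= self.x1 and self.y0 <= y <= self.y1


def node_at(nodes, x, y):
    """Return the first node whose box contains (x, y), or None."""
    for node in nodes:
        if node.contains(x, y):
            return node
    return None


def trace_edge(nodes, line_pixels, tip, direction):
    """Recover one directed edge from an arrowhead.

    tip: (x, y) pixel of the arrowhead tip.
    direction: unit step (dx, dy) in which the arrowhead points.
    The target is the node just ahead of the tip; the source is found by
    walking backward over connector pixels (4-connectivity, an assumption).
    """
    target = node_at(nodes, tip[0] + direction[0], tip[1] + direction[1])
    if target is None:
        return None
    queue, seen = deque([tip]), {tip}
    while queue:
        x, y = queue.popleft()
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (x + dx, y + dy)
            if nxt in seen:
                continue
            hit = node_at(nodes, *nxt)
            if hit is not None and hit != target:
                return (hit.label, target.label)  # reached the source node
            if nxt in line_pixels:
                seen.add(nxt)
                queue.append(nxt)
    return None


def build_graph(nodes, line_pixels, arrowheads):
    """Collect traced edges into an adjacency list keyed by source label."""
    graph = {}
    for tip, direction in arrowheads:
        edge = trace_edge(nodes, line_pixels, tip, direction)
        if edge is not None:
            graph.setdefault(edge[0], []).append(edge[1])
    return graph


# Toy input: two nodes joined by an L-shaped connector ending in an arrowhead.
nodes = [Node("Motor runs?", 0, 0, 2, 2), Node("Check fuse", 8, 5, 10, 7)]
line_pixels = ({(x, 1) for x in range(3, 6)}
               | {(5, y) for y in range(1, 7)}
               | {(x, 6) for x in range(5, 8)})
graph = build_graph(nodes, line_pixels, [((7, 6), (1, 0))])
print(graph)
```

The resulting adjacency list is one simple form of the queryable directed graph the abstract refers to: looking up a node label returns the steps that follow it. This sketch makes no attempt to handle junctions, crossing connectors, or edge labels such as "yes" and "no".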
Problem

Research questions and friction points this paper is trying to address.

procedural knowledge extraction
maintenance flowcharts
connection topology
vision-language models
ISO 5807
Innovation

Methods, ideas, or system contributions that make the work stand out.

procedural knowledge extraction
flowchart understanding
directed graph reconstruction
arrowhead-based edge detection
YOLOv8 and EasyOCR integration
Guillermo Gil de Avalle
University of Groningen, Nettelbosje 2, Groningen, The Netherlands
Laura Maruster
Assistant professor at University of Groningen, The Netherlands
process modeling, process mining
Eric Sloot
Philips Consumer Lifestyle B.V., Oliemolenstraat 5, Drachten, The Netherlands
Christos Emmanouilidis
Associate Professor, University of Groningen
Human in the Loop, Human-Centric AI, IoT, Information Systems, Asset Lifecycle Management