AI-driven visual monitoring of industrial assembly tasks

📅 2025-06-18
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Industrial visual monitoring for assembly tasks faces significant challenges in robust action recognition under marker-free, non-rigid environments; existing approaches rely on fixed workstations or explicit visual markers. This paper proposes ViMAT, the first end-to-end system enabling real-time assembly action recognition without markers or fixed workstation constraints. ViMAT enhances perceptual robustness via multi-view video feature extraction and temporal state modeling; integrates neural perception with symbolic prior knowledge reasoning to address partial observability and visual uncertainty; and employs a lightweight architecture for real-time inference. Evaluated on two real-world production-line tasks—LEGO component replacement and hydraulic press die reconfiguration—ViMAT outperforms mainstream baselines by an average accuracy gain of 12.6%, demonstrating strong practicality and generalization capability in complex industrial settings.

Technology Category

Application Category

📝 Abstract
Visual monitoring of industrial assembly tasks is critical for preventing equipment damage due to procedural errors and ensuring worker safety. Although commercial solutions exist, they typically require rigid workspace setups or the application of visual markers to simplify the problem. We introduce ViMAT, a novel AI-driven system for real-time visual monitoring of assembly tasks that operates without these constraints. ViMAT combines a perception module that extracts visual observations from multi-view video streams with a reasoning module that infers the most likely action being performed based on the observed assembly state and prior task knowledge. We validate ViMAT on two assembly tasks, involving the replacement of LEGO components and the reconfiguration of hydraulic press molds, demonstrating its effectiveness through quantitative and qualitative analysis in challenging real-world scenarios characterized by partial and uncertain visual observations. Project page: https://tev-fbk.github.io/ViMAT
Problem

Research questions and friction points this paper is trying to address.

AI-driven visual monitoring of industrial assembly tasks
Eliminates need for rigid setups or visual markers
Handles partial and uncertain visual observations effectively
Innovation

Methods, ideas, or system contributions that make the work stand out.

AI-driven real-time visual monitoring system
Multi-view video streams for perception
Reasoning module for action inference
🔎 Similar Papers
No similar papers found.
M
Mattia Nardon
Fondazione Bruno Kessler, Trento, Italy
S
S. Messelodi
Fondazione Bruno Kessler, Trento, Italy
A
Antonio Granata
Meccanica del Sarca s.p.a., Trento, Italy
Fabio Poiesi
Fabio Poiesi
Fondazione Bruno Kessler
Computer Vision
A
Alberto Danese
Meccanica del Sarca s.p.a., Trento, Italy
Davide Boscaini
Davide Boscaini
Fondazione Bruno Kessler
Geometric Deep LearningComputer Vision