🤖 AI Summary
To address compliance auditing in agrifood processing facilities, where manual quality inspection suffers from human error and RGB-only vision fails under occlusion or low illumination, this paper proposes an unsupervised, edge-deployable multimodal object tracking framework. It fuses 3D time-of-flight (ToF) and RGB modalities, using joint motion-geometric modeling and cross-modal feature alignment to achieve zero-shot, continuous, real-time detection and tracking without labeled data. Deployed on a Jetson edge platform, the system achieves >98.2% tracking accuracy and <120 ms end-to-end latency in a knife sanitization monitoring scenario, and reduces manual audit effort by 76%. Long-term deployment on real production lines validates its operational robustness. This work introduces the first edge-native, unsupervised multimodal tracking paradigm, improving quality control rigor and process traceability in industrial food safety applications.
📝 Abstract
Regulatory compliance auditing in agrifood processing facilities is crucial for upholding high standards of quality assurance and traceability. However, current auditing is manual and intermittent, which leaves gaps in the record and creates risk. To address these shortcomings, we introduce a real-time, multi-modal sensing system that combines 3D time-of-flight and RGB cameras and runs unsupervised learning techniques on edge AI devices. The proposed system enables continuous object tracking, which improves record-keeping efficiency and reduces manual labor. We demonstrate the system in a knife sanitization monitoring scenario, showing that it overcomes the occlusion and low-light limitations commonly encountered with conventional RGB cameras.
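To illustrate why a depth channel can keep working when RGB struggles in low light, the following is a minimal sketch of label-free foreground detection from ToF depth alone. It is not the authors' pipeline: per-pixel median background subtraction is a simple stand-in technique, and the function names, the `min_gap_m` threshold, and the tiny 4x4 frames are all assumptions made for illustration.

```python
from statistics import median


def background_model(depth_frames):
    """Per-pixel median depth (metres) over frames of the empty scene."""
    rows, cols = len(depth_frames[0]), len(depth_frames[0][0])
    return [
        [median(frame[r][c] for frame in depth_frames) for c in range(cols)]
        for r in range(rows)
    ]


def foreground_mask(depth, background, min_gap_m=0.05):
    """Mark pixels closer to the camera than the background by min_gap_m.

    A depth of 0 is treated as an invalid ToF reading and ignored
    (an assumed convention; real sensors differ).
    """
    return [
        [d > 0 and (b - d) > min_gap_m for d, b in zip(d_row, b_row)]
        for d_row, b_row in zip(depth, background)
    ]


def centroid(mask):
    """Mean (row, col) of foreground pixels, or None if the mask is empty."""
    points = [(r, c) for r, row in enumerate(mask) for c, on in enumerate(row) if on]
    if not points:
        return None
    n = len(points)
    return (sum(r for r, _ in points) / n, sum(c for _, c in points) / n)


# Hypothetical data: a flat surface 1.0 m away, then an object 0.8 m away.
empty = [[1.0] * 4 for _ in range(4)]
background = background_model([empty, empty, empty])

frame = [row[:] for row in empty]
frame[1][1] = 0.8
frame[1][2] = 0.8

position = centroid(foreground_mask(frame, background))
```

Because the mask depends only on measured distance, it is unaffected by scene illumination. A full system of the kind the abstract describes would still need the RGB stream, data association across frames, and handling of sensor noise, none of which this sketch attempts.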