🤖 AI Summary
To address challenges in industrial panel monitoring, including ambiguous localization of dense, small-scale text, cross-scale recognition difficulty, and boundary ambiguity, this paper proposes a multi-scale Transformer architecture that integrates Catmull-Rom spline-guided feature sampling. The method combines a lightweight hybrid attention mechanism, explicit multi-layer fusion of spatial and semantic features that encode text shape, position, and semantics, and optimizations aimed at edge deployment. The paper also introduces IPM, the first benchmark dataset for industrial panel monitoring. On IPM, the approach reduces the false-negative rate by 32% and the recognition error rate by 27% relative to state-of-the-art methods. Deployed on a custom edge vision system (NVIDIA Jetson Orin), it runs at 23 FPS. The key innovations are spline-driven, geometry-aware feature sampling and an edge-efficient framework for joint multi-scale modeling.
📝 Abstract
Text spotting for industrial panels is a key task for intelligent monitoring. However, efficient and accurate text spotting on complex industrial panels remains challenging because of cross-scale localization and ambiguous boundaries in dense text regions. Moreover, most existing methods focus on representing the shape of a single text instance and neglect the multi-scale feature information shared across different texts. To address these issues, this work proposes a novel multi-scale dense text spotter for an edge AI-based vision system (EdgeSpotter) to achieve accurate and robust industrial panel monitoring. Specifically, a novel Transformer with an efficient mixer is developed to learn the interdependencies among multi-level features, integrating multi-layer spatial and semantic cues. In addition, a new feature sampling method based on Catmull-Rom splines is designed, which explicitly encodes the shape, position, and semantic information of text, thereby reducing the missed detections and recognition errors caused by multi-scale or dense text regions. Furthermore, a new benchmark dataset for industrial panel monitoring (IPM) is constructed. Extensive qualitative and quantitative evaluations on this challenging benchmark validate the superior performance of the proposed method across a range of panel monitoring tasks. Finally, tests on the self-designed edge AI-based vision system demonstrate the practicality of the method. The code and demo will be available at https://github.com/vision4robotics/EdgeSpotter.
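To make the idea of spline-guided sampling concrete, the following is a minimal sketch of how points could be sampled along a uniform Catmull-Rom spline through a few text control points. It is not the EdgeSpotter implementation: the function names, the endpoint-duplication scheme, and the `samples_per_segment` parameter are assumptions made for illustration, and the abstract does not specify how the sampled locations are turned into features.

```python
import numpy as np


def catmull_rom_segment(p0, p1, p2, p3, t):
    """Evaluate a uniform Catmull-Rom segment between p1 and p2 for t in [0, 1]."""
    t = np.asarray(t, dtype=float)[:, None]  # shape (S, 1) for broadcasting over (x, y)
    return 0.5 * (
        2.0 * p1
        + (-p0 + p2) * t
        + (2.0 * p0 - 5.0 * p1 + 4.0 * p2 - p3) * t**2
        + (-p0 + 3.0 * p1 - 3.0 * p2 + p3) * t**3
    )


def sample_catmull_rom(control_points, samples_per_segment=8):
    """Sample points along a spline that passes through every control point.

    Hypothetical helper: the first and last control points are duplicated so
    that the curve starts and ends exactly on them.
    """
    pts = np.asarray(control_points, dtype=float)
    padded = np.vstack([pts[0], pts, pts[-1]])
    t = np.linspace(0.0, 1.0, samples_per_segment, endpoint=False)
    segments = [
        catmull_rom_segment(*padded[i:i + 4], t) for i in range(len(pts) - 1)
    ]
    # Append the final control point, which the half-open t range leaves out.
    return np.vstack(segments + [pts[-1][None, :]])


# Illustrative control points along a curved text centerline (made-up values).
control_points = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0], [3.0, 1.0]]
curve = sample_catmull_rom(control_points, samples_per_segment=4)
```

Because a Catmull-Rom spline interpolates its control points, the sampled curve follows the annotated text geometry directly. In a spotting pipeline, locations like these could then index into a feature map (for example by bilinear interpolation), though that step is an assumption here and not something the abstract describes.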