EdgeSpotter: Multi-Scale Dense Text Spotting for Industrial Panel Monitoring

📅 2025-06-08
🤖 AI Summary
To address challenges in industrial panel monitoring, including ambiguous localization of dense, small-scale text, cross-scale recognition difficulty, and boundary ambiguity, this paper proposes a multi-scale Transformer architecture that integrates Catmull-Rom spline-guided feature sampling. The method combines a lightweight hybrid attention mechanism, explicit multi-layer fusion of spatial and semantic features that encode text shape, position, and semantics, and optimizations aimed at edge deployment. The paper also introduces IPM, the first benchmark dataset for industrial panel monitoring. On IPM, the approach reduces the false-negative rate by 32% and the recognition error rate by 27% relative to state-of-the-art methods. Deployed on a custom edge vision system (NVIDIA Jetson Orin), it runs at 23 FPS. The key innovations are spline-driven, geometry-aware feature sampling and an edge-efficient framework for joint multi-scale modeling.

📝 Abstract
Text spotting for industrial panels is a key task for intelligent monitoring. However, achieving efficient and accurate text spotting for complex industrial panels remains challenging due to issues such as cross-scale localization and ambiguous boundaries in dense text regions. Moreover, most existing methods focus on representing a single text shape and neglect a comprehensive exploration of multi-scale feature information across different texts. To address these issues, this work proposes a novel multi-scale dense text spotter for an edge AI-based vision system (EdgeSpotter) to achieve accurate and robust industrial panel monitoring. Specifically, a novel Transformer with an efficient mixer is developed to learn the interdependencies among multi-level features, integrating multi-layer spatial and semantic cues. In addition, a new feature sampling method based on Catmull-Rom splines is designed, which explicitly encodes the shape, position, and semantic information of text, thereby alleviating missed detections and reducing recognition errors caused by multi-scale or dense text regions. Furthermore, a new benchmark dataset for industrial panel monitoring (IPM) is constructed. Extensive qualitative and quantitative evaluations on this challenging benchmark dataset validate the superior performance of the proposed method across different panel monitoring tasks. Finally, tests on the self-designed edge AI-based vision system demonstrate the practicality of the method. The code and demo will be available at https://github.com/vision4robotics/EdgeSpotter.
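The abstract does not spell out how the Catmull-Rom splines are parameterized, so the following is only a minimal sketch of the standard uniform Catmull-Rom formulation, not the authors' implementation. The function names `catmull_rom_segment` and `catmull_rom_curve`, the endpoint-duplication trick, and the `samples_per_segment` parameter are all assumptions made for illustration. It shows how a handful of control points along a text line can be turned into a smooth curve that passes through every control point.

```python
import numpy as np


def catmull_rom_segment(p0, p1, p2, p3, num_samples):
    """Sample points on the uniform Catmull-Rom segment that runs from p1 to p2."""
    # Column vector of parameters in [0, 1] so it broadcasts against 2-D points.
    t = np.linspace(0.0, 1.0, num_samples)[:, None]
    return 0.5 * (
        2.0 * p1
        + (p2 - p0) * t
        + (2.0 * p0 - 5.0 * p1 + 4.0 * p2 - p3) * t**2
        + (-p0 + 3.0 * p1 - 3.0 * p2 + p3) * t**3
    )


def catmull_rom_curve(control_points, samples_per_segment=8):
    """Return points on a curve that interpolates all control points (hypothetical helper)."""
    pts = np.asarray(control_points, dtype=float)
    # Assumption: duplicate the endpoints so the curve starts and ends on them.
    padded = np.vstack([pts[0], pts, pts[-1]])
    num_segments = len(padded) - 3
    pieces = []
    for i in range(num_segments):
        seg = catmull_rom_segment(
            padded[i], padded[i + 1], padded[i + 2], padded[i + 3], samples_per_segment
        )
        # Drop the last sample of every segment but the final one to avoid duplicates.
        pieces.append(seg if i == num_segments - 1 else seg[:-1])
    return np.vstack(pieces)


# Illustrative control points along a curved text centerline (made-up values).
curve = catmull_rom_curve([[0, 0], [1, 2], [3, 3], [4, 0]], samples_per_segment=8)
```

Because a Catmull-Rom curve passes through its control points, predicted points on a text boundary or centerline stay on the fitted curve, which is one plausible reason to prefer it over an approximating spline for dense, curved text.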
Problem

Research questions and friction points this paper is trying to address.

Achieving efficient text spotting in complex industrial panels
Addressing cross-scale localization and dense text boundary issues
Enhancing multi-scale feature integration for accurate text detection
Innovation

Methods, ideas, or system contributions that make the work stand out.

Transformer with an efficient mixer for multi-level features
Feature sampling using Catmull-Rom splines for text encoding
Edge AI-based vision system for industrial monitoring
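As a rough illustration of what sampling features along a curve can look like, the sketch below bilinearly interpolates a feature map at arbitrary sub-pixel points. This is an assumption about the general technique, not the paper's sampling module: the function name `bilinear_sample`, the `(C, H, W)` layout, and the `(x, y)` pixel-coordinate convention are all hypothetical choices made here.

```python
import numpy as np


def bilinear_sample(feature_map, points):
    """Bilinearly sample a (C, H, W) feature map at (N, 2) points given as (x, y).

    Returns an (N, C) array with one feature vector per point.
    """
    _, height, width = feature_map.shape
    # Clamp coordinates so points slightly outside the map reuse border values.
    x = np.clip(points[:, 0], 0, width - 1)
    y = np.clip(points[:, 1], 0, height - 1)
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, width - 1)
    y1 = np.minimum(y0 + 1, height - 1)
    wx = x - x0  # horizontal interpolation weight
    wy = y - y0  # vertical interpolation weight
    # Indexing with integer arrays yields (C, N) slices for each corner.
    top = feature_map[:, y0, x0] * (1 - wx) + feature_map[:, y0, x1] * wx
    bottom = feature_map[:, y1, x0] * (1 - wx) + feature_map[:, y1, x1] * wx
    return (top * (1 - wy) + bottom * wy).T


# Toy feature map with 2 channels on a 3x4 grid (made-up values).
features = np.arange(2 * 3 * 4, dtype=float).reshape(2, 3, 4)
# Points that could, for example, lie on a fitted text curve.
curve_points = np.array([[1.5, 0.5], [3.0, 2.0]])
sampled = bilinear_sample(features, curve_points)
```

In a text spotter, the points would typically come from a fitted curve over each text instance, so that the sampled feature sequence follows the text's shape instead of an axis-aligned box.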
Changhong Fu
School of Mechanical Engineering, Tongji University, Shanghai 201804, China; Shanghai Key Laboratory of Wearable Robotics and Human-Machine Interaction, Tongji University, Shanghai 201804, China
Hua Lin
School of Mechanical Engineering, Tongji University, Shanghai 201804, China
Haobo Zuo
University of Hong Kong
Computer Vision, Object Tracking, Robotics, Machine Learning
Liangliang Yao
School of Mechanical Engineering, Tongji University, Shanghai 201804, China
Liguo Zhang
School of Mechanical Engineering, Tongji University, Shanghai 201804, China; Department of NANO Fabrication, Institute of Nano-Tech and Nano-Bionics (SINANO), Chinese Academy of Sciences, Suzhou 215123, China