Edge Approximation Text Detector

2025-04-05
AI Summary
Irregular scene text detection suffers from coarse contour representations and complex reconstruction pipelines. To address these issues, this paper proposes EdgeText, which treats text contour modeling as the problem of fitting continuous, smooth parametric curves to text edges. EdgeText achieves compact and efficient reconstruction through three core components: center-point localization, edge function generation, and truncation point prediction. It is the first method to formalize text detection as an edge approximation task. We introduce a Bilateral Enhanced Perception (BEP) module to strengthen edge-aware feature representation, and propose a proportional integral loss (PI-loss) to improve parameter convergence and multi-scale robustness of the fitted curves. Extensive experiments demonstrate that EdgeText achieves state-of-the-art performance on multiple benchmarks, significantly improving detection accuracy for irregular text while simplifying the reconstruction pipeline and reducing model complexity.

Abstract
Pursuing efficient text shape representations helps scene text detection models focus on compact foreground regions and streamlines contour reconstruction, which simplifies the whole detection pipeline. Current approaches either represent irregular shapes via a box-to-polygon strategy or decompose a contour into pieces that are fitted gradually, so they suffer from coarse contours or complex pipelines. To address these issues, we introduce EdgeText, which fits text contours compactly while avoiding excessive contour rebuilding. Concretely, we observe that the two long edges of a text instance can be regarded as smooth curves. This allows us to build contours from continuous, smooth edges that cover text regions tightly instead of fitting them piecewise, which avoids both limitations of current models. Inspired by this observation, EdgeText formulates text representation as an edge approximation problem solved with parameterized curve fitting functions. In the inference stage, our model first locates text centers and then, relying on these points, creates curve functions that approximate the text edges. Meanwhile, truncation points are determined from the location features. Finally, the pixel coordinates of the truncation points are used to extract curve segments from the curve functions, and these segments reconstruct the text contours. Furthermore, because EdgeText depends heavily on text edges, we design a bilateral enhanced perception (BEP) module that encourages the model to attend to edge features. Additionally, to accelerate the learning of the curve function parameters, we introduce a proportional integral loss (PI-loss) that forces the model to focus on the curve distribution and avoid being disturbed by text scales.
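The final reconstruction step described in the abstract (cutting segments out of the fitted edge curves at the truncation points and joining them into a contour) can be illustrated with a minimal sketch. The abstract does not state the curve family or the parameter layout, so everything below is an assumption made for illustration: each edge is taken to be a polynomial y = f(x) with coefficients in ascending order, each truncation pair is taken to be two x-coordinates in pixels, and the helper names `eval_poly`, `sample_edge`, and `reconstruct_contour` are hypothetical rather than taken from EdgeText.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def eval_poly(coeffs: Sequence[float], x: float) -> float:
    """Evaluate a polynomial given ascending coefficients (Horner's rule)."""
    y = 0.0
    for c in reversed(coeffs):
        y = y * x + c
    return y


def sample_edge(coeffs: Sequence[float], x_start: float, x_end: float,
                n: int) -> List[Point]:
    """Sample n evenly spaced points of y = f(x) between two truncation x-values."""
    if n < 2:
        raise ValueError("need at least two samples per edge")
    step = (x_end - x_start) / (n - 1)
    xs = [x_start + i * step for i in range(n)]
    return [(x, eval_poly(coeffs, x)) for x in xs]


def reconstruct_contour(top: Sequence[float], bottom: Sequence[float],
                        top_trunc: Tuple[float, float],
                        bottom_trunc: Tuple[float, float],
                        n: int = 8) -> List[Point]:
    """Join two truncated edge curves into one closed polygon.

    Assumed layout (not from the paper): the top edge is traversed left to
    right and the bottom edge right to left, so the vertices run around the
    text region in a single direction.
    """
    top_pts = sample_edge(top, top_trunc[0], top_trunc[1], n)
    bottom_pts = sample_edge(bottom, bottom_trunc[0], bottom_trunc[1], n)
    return top_pts + bottom_pts[::-1]


# Illustrative values only: two parallel parabolic edges, 20 px apart.
contour = reconstruct_contour(
    top=[10.0, 0.0, 0.01],      # y = 10 + 0.01 * x^2
    bottom=[30.0, 0.0, 0.01],   # y = 30 + 0.01 * x^2
    top_trunc=(0.0, 100.0),
    bottom_trunc=(0.0, 100.0),
    n=5,
)
```

Under these assumptions a text instance is fully described by two short coefficient vectors and four truncation coordinates, and the polygon is obtained in a single pass with no iterative refinement, which is the kind of pipeline simplification the abstract argues for.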
Problem

Research questions and friction points this paper is trying to address.

Efficient text shape representation for scene text detection
Compact contour reconstruction to simplify detection pipeline
Parameterized curve fitting for accurate text edge approximation
Innovation

Methods, ideas, or system contributions that make the work stand out.

EdgeText fits text contours via smooth curves
Bilateral enhanced perception module enhances edge recognition
Proportional integral loss accelerates curve parameter learning
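This page does not give the definition of PI-loss, so the sketch below is not the paper's loss. It only illustrates the property the abstract asks for, a curve-fitting penalty that is not disturbed by text scale, using a plainly different and simpler technique: the area between the predicted and target curves, estimated with the trapezoidal rule and divided by the span length times a text-scale value. The polynomial parameterization, the `scale` argument, and the function names are all assumptions made for illustration.

```python
from typing import Sequence


def eval_poly(coeffs: Sequence[float], x: float) -> float:
    """Evaluate a polynomial given ascending coefficients (Horner's rule)."""
    y = 0.0
    for c in reversed(coeffs):
        y = y * x + c
    return y


def normalized_area_gap(pred: Sequence[float], target: Sequence[float],
                        x_start: float, x_end: float, scale: float,
                        n: int = 64) -> float:
    """Area between two curves on [x_start, x_end], divided by span * scale.

    `scale` stands in for a text-size measure such as the text height in
    pixels (an assumption; the source does not specify one).
    """
    if n < 2 or x_end <= x_start or scale <= 0:
        raise ValueError("invalid sampling range or scale")
    step = (x_end - x_start) / (n - 1)
    gaps = [
        abs(eval_poly(pred, x_start + i * step)
            - eval_poly(target, x_start + i * step))
        for i in range(n)
    ]
    # Trapezoidal rule: the two end samples carry half weight.
    area = step * (sum(gaps) - 0.5 * (gaps[0] + gaps[-1]))
    return area / ((x_end - x_start) * scale)


# The same relative error at two text sizes yields the same penalty.
small = normalized_area_gap([12.0], [10.0], 0.0, 100.0, scale=20.0)
large = normalized_area_gap([24.0], [20.0], 0.0, 200.0, scale=40.0)
```

In this toy case the larger text is the smaller one magnified by two, and both calls return approximately 0.1. An unnormalized area would instead grow by a factor of four, letting large text instances dominate training.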
Authors

Chuang Yang
Woven City; Alumnus@SUSTech & UTokyo
Spatio-temporal Data Mining, Human Mobility, Data Visualization
Xu Han
School of Artificial Intelligence, OPtics and ElectroNics (iOPEN), Northwestern Polytechnical University, Xi'an 710072, P.R. China
Tao Han
Shanghai Artificial Intelligence Laboratory, Longwen Road 129, Xuhui District, 200232 Shanghai, China
Han Han
LS2N
Musical Information Retrieval, Acoustics
Bingxuan Zhao
School of Artificial Intelligence, OPtics and ElectroNics (iOPEN), Northwestern Polytechnical University, Xi'an 710072, P.R. China
Qi Wang
School of Artificial Intelligence, OPtics and ElectroNics (iOPEN), Northwestern Polytechnical University, Xi'an 710072, P.R. China