🤖 AI Summary
This study addresses the challenges of accurately assessing license plate readability in complex real-world scenarios, where poor image quality, compression artifacts, and suboptimal camera installation hinder existing systems. To overcome limitations of the current benchmark—namely its small scale and erroneous annotations—the authors construct a refined dataset more than three times larger than the original, featuring multi-dimensional annotations at the license plate, vehicle, and image levels, along with corrected labels. They propose a loss function based on an exponential moving average and an improved learning rate scheduling strategy. Additionally, a novel evaluation protocol isolating camera contamination is introduced to quantify train-test distribution shift. Experiments demonstrate that the proposed approach achieves an F1 score of 89.5% on the test set, significantly outperforming prior state-of-the-art methods and validating the effectiveness of both the enhanced dataset and the algorithmic innovations.
📝 Abstract
Modern Automatic License Plate Recognition (ALPR) systems achieve outstanding performance in controlled, well-defined scenarios. However, large-scale real-world usage remains challenging due to low-quality imaging devices, compression artifacts, and suboptimal camera installation. Identifying illegible license plates (LPs) has recently become feasible through a dedicated benchmark; however, its impact has been limited by its small size and annotation errors. In this work, we expand the original benchmark to over three times its size with two extra capture days, revise its annotations, and introduce novel labels. LP-level annotations include bounding boxes, text, and legibility level, while vehicle-level annotations comprise make, model, type, and color. Image-level annotations feature camera identity, capture conditions (e.g., rain and faulty cameras), acquisition time, and day ID. We present a novel training procedure featuring an Exponential Moving Average-based loss function and a refined learning rate scheduler, addressing common mistakes in testing. These improvements enable a baseline model to achieve an 89.5% F1-score on the test set, considerably surpassing the previous state of the art. We further introduce a novel protocol that explicitly addresses camera contamination between training and evaluation splits, where results show only a small impact. Dataset and code are publicly available at https://github.com/lmlwojcik/LPLCv2-Dataset.
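The abstract does not spell out the EMA-based loss, but the general idea behind exponential moving averages in training can be sketched as follows. This is a minimal illustration of EMA smoothing of a per-step loss signal, not the paper's actual formulation; the function name `ema_loss` and the decay value `beta=0.9` are illustrative assumptions.

```python
def ema_loss(losses, beta=0.9):
    """Exponentially smooth a sequence of per-step loss values.

    Recurrence (standard EMA, assumed here for illustration):
        ema_t = beta * ema_{t-1} + (1 - beta) * loss_t,  with ema_0 = loss_0.
    """
    ema = None
    history = []
    for loss in losses:
        # First step initializes the average; later steps blend in new values.
        ema = loss if ema is None else beta * ema + (1.0 - beta) * loss
        history.append(ema)
    return history

# A noisy loss sequence is damped toward its underlying trend.
smoothed = ema_loss([1.0, 0.5, 2.0, 0.4], beta=0.9)
```

Higher `beta` gives a smoother but more lagged signal; in a training loop the smoothed value (rather than the raw per-batch loss) would drive the loss term or monitoring logic.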