🤖 AI Summary
This work addresses scale ambiguity in monocular camera-based inter-vehicle distance estimation, which prevents a single camera from recovering metric range on its own. The authors propose a geometric method that resolves scale without training data, active illumination, or costly sensors such as LiDAR and radar, using the standardized typography of U.S. license plates as passive fiducial markers for metric ranging. The system combines a four-method parallel plate detector, a three-stage state identification engine (OCR text matching, multi-design color scoring, and a lightweight neural network classifier), and hybrid depth fusion with inverse-variance weighting and online scale alignment. A one-dimensional constant-velocity Kalman filter then produces smoothed distance, relative velocity, and time-to-collision for collision warning. Baseline validation reproduces a 2.3% coefficient of variation in character height measurements and a 36% reduction in distance-estimate variance compared with plate-width methods from prior work. Outdoor experiments report a mean absolute error of 2.3% at 10 m, relative error five times lower than deep learning baselines, and continuous distance output during brief plate occlusions.
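The two geometric ideas in this summary, ranging from a feature of known physical size and inverse-variance fusion of several range estimates, can be illustrated with a minimal Python sketch. It assumes an ideal pinhole camera with a known focal length in pixels. The function names and the numeric values in the usage note are hypothetical placeholders, not taken from the paper.

```python
def distance_from_char_height(focal_px, char_height_m, char_height_px):
    """Pinhole-model range: Z = f * H / h.

    focal_px       -- focal length in pixels (assumed known from calibration)
    char_height_m  -- physical character height in meters (assumed known prior)
    char_height_px -- measured character height in the image, in pixels
    """
    if char_height_px <= 0:
        raise ValueError("measured height must be positive")
    return focal_px * char_height_m / char_height_px


def fuse_inverse_variance(estimates, variances):
    """Combine independent estimates, weighting each by 1 / variance.

    Returns the fused estimate and its variance, 1 / sum(1 / var_i).
    """
    if len(estimates) != len(variances) or not estimates:
        raise ValueError("need matching, non-empty inputs")
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    fused = sum(w * z for w, z in zip(weights, estimates)) / total
    return fused, 1.0 / total
```

For example, with a placeholder focal length of 1000 px and a placeholder character height of 0.07 m, a character imaged at 7 px maps to a range of about 10 m. Fusing estimates of 10 m and 12 m with variances 1 and 3 gives about 10.5 m, pulled toward the more certain estimate.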
📝 Abstract
Accurate inter-vehicle distance estimation is a cornerstone of Advanced Driver Assistance Systems (ADAS) and autonomous driving. While LiDAR and radar provide high precision, their high cost prohibits widespread adoption in mass-market vehicles. Monocular camera-based estimation offers a low-cost alternative but suffers from fundamental scale ambiguity. Recent deep learning methods for monocular depth achieve impressive results yet require expensive supervised training, suffer from domain shift, and produce predictions that are difficult to certify for safety-critical deployment. This paper presents a framework that exploits the standardized typography of United States license plates as passive fiducial markers for metric ranging, resolving scale ambiguity through explicit geometric priors without any training data or active illumination. First, a four-method parallel plate detector achieves robust plate reading across the full automotive lighting range. Second, a three-stage state identification engine fusing OCR text matching, multi-design color scoring, and a lightweight neural network classifier provides robust identification across all ambient conditions. Third, hybrid depth fusion with inverse-variance weighting and online scale alignment, combined with a one-dimensional constant-velocity Kalman filter, delivers smoothed distance, relative velocity, and time-to-collision for collision warning. Baseline validation reproduces a 2.3% coefficient of variation in character height measurements and a 36% reduction in distance-estimate variance compared with plate-width methods from prior work. Extensive outdoor experiments confirm a mean absolute error of 2.3% at 10 m and continuous distance output during brief plate occlusions, outperforming deep learning baselines by a factor of five in relative error.
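As an illustration of the filtering stage, the following is a minimal sketch of a one-dimensional constant-velocity Kalman filter that tracks distance and relative velocity and derives time-to-collision. It is a textbook formulation, not the authors' implementation. The class name, the white-noise-acceleration process model, and all default noise values are assumptions. Passing no measurement runs prediction only, which is one plausible way to keep producing output through a brief occlusion.

```python
import math


class ConstantVelocityKF:
    """State is [distance d, relative velocity v]; only d is measured."""

    def __init__(self, d0, meas_var, accel_var=1.0, v0_var=100.0):
        self.d, self.v = d0, 0.0
        # Covariance entries P = [[p00, p01], [p10, p11]].
        self.p00, self.p01, self.p10, self.p11 = meas_var, 0.0, 0.0, v0_var
        self.r = meas_var   # measurement noise variance (assumed value)
        self.q = accel_var  # acceleration noise variance (assumed value)

    def step(self, dt, z=None):
        """Predict over dt, then update with measurement z if available."""
        # Predict: d <- d + v*dt, P <- F P F^T + Q with F = [[1, dt], [0, 1]].
        self.d += self.v * dt
        p00 = self.p00 + dt * (self.p01 + self.p10) + dt * dt * self.p11
        p01 = self.p01 + dt * self.p11
        p10 = self.p10 + dt * self.p11
        p11 = self.p11
        p00 += self.q * dt**4 / 4
        p01 += self.q * dt**3 / 2
        p10 += self.q * dt**3 / 2
        p11 += self.q * dt**2
        if z is not None:
            # Update with H = [1, 0]: gain K = P H^T / (H P H^T + r).
            s = p00 + self.r
            k0, k1 = p00 / s, p10 / s
            residual = z - self.d
            self.d += k0 * residual
            self.v += k1 * residual
            p00, p01, p10, p11 = (
                (1 - k0) * p00,
                (1 - k0) * p01,
                p10 - k1 * p00,
                p11 - k1 * p01,
            )
        self.p00, self.p01, self.p10, self.p11 = p00, p01, p10, p11
        return self.d, self.v


def time_to_collision(distance, velocity):
    """Seconds until distance reaches zero; infinite if not closing."""
    return distance / -velocity if velocity < 0 else math.inf
```

A caller would invoke `step(dt, z)` once per frame with the fused range, or `step(dt)` when no plate is visible, and raise a warning when `time_to_collision` falls below a chosen threshold.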