AI Summary
This work addresses high-fidelity digital twin reconstruction for automated valet parking, where sparse front-view observations, dynamic occlusions, and extreme lighting conditions severely degrade reconstruction quality. The authors propose a lightweight, training-free online 3D reconstruction system that uses OpenStreetMap (OSM) semantic topological priors to guide Truncated Signed Distance Function (TSDF) integration, enabling geometrically consistent mapping without iterative optimization. A quad-modal consistency constraint (normal/height/depth) filters dynamic objects in real time, while illumination-robust texture fusion is achieved through adaptive L-channel weighting in CIELAB space combined with depth-gradient suppression. Evaluated on a real-world 68,000 m² parking environment against a 3D Gaussian Splatting (3DGS) baseline, the system achieves an SSIM of 0.87 (a 16% improvement), runs about 15× faster end-to-end, reduces GPU memory usage by 83.3%, and sustains over 30 FPS on a GTX 1660.
Abstract
High-fidelity parking-lot digital twins provide essential priors for path planning, collision checking, and perception validation in Automated Valet Parking (AVP). Yet robot-oriented reconstruction faces a trilemma: sparse forward-facing views cause weak parallax and ill-posed geometry; dynamic occlusions and extreme lighting hinder stable texture fusion; and neural rendering typically needs expensive offline optimization, violating edge-side streaming constraints. We propose ParkingTwin, a training-free, lightweight system for online streaming 3D reconstruction. First, OSM-prior-driven geometric construction uses OpenStreetMap semantic topology to directly generate a metric-consistent TSDF, replacing blind geometric search with deterministic mapping and avoiding costly optimization. Second, geometry-aware dynamic filtering employs a quad-modal constraint field (normal/height/depth consistency) to reject moving vehicles and transient occlusions in real time. Third, illumination-robust fusion in CIELAB decouples luminance and chromaticity via adaptive L-channel weighting and depth-gradient suppression, reducing seams under abrupt lighting changes. ParkingTwin runs at 30+ FPS on an entry-level GTX 1660. On a 68,000 m^2 real-world dataset, it achieves SSIM 0.87 (+16.0%), delivers about 15x end-to-end speedup, and reduces GPU memory by 83.3% compared with state-of-the-art 3D Gaussian Splatting (3DGS) that typically requires high-end GPUs (RTX 4090D). The system outputs explicit triangle meshes compatible with Unity/Unreal digital-twin pipelines. Project page: https://mihoutao-liu.github.io/ParkingTwin/
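To make the second and third components more concrete, the following is a minimal sketch of how consistency gating and luminance-decoupled fusion could fit together for a single texel. It is not the authors' implementation. The `Observation` layout, the thresholds in `is_static`, and the exponential weights in `weights` are all illustrative assumptions, and the sketch checks only the three cues the abstract names (normal, height, depth). The fusion step is a standard weighted running average, with the L channel weighted separately from the a and b channels.

```python
import math
from dataclasses import dataclass


@dataclass
class Observation:
    """One pixel compared against the prior surface (hypothetical layout)."""
    normal_cos: float   # cosine between observed and prior surface normals
    height_err: float   # |observed height - prior height|, metres
    depth_err: float    # |observed depth - depth rendered from the prior|, metres
    depth_grad: float   # local depth-gradient magnitude, metres per pixel
    lab: tuple          # (L, a, b) colour in CIELAB, L in [0, 100]


def is_static(obs, min_cos=0.8, max_height=0.15, max_depth=0.20):
    """Accept a sample only if every geometric cue agrees with the prior.
    All thresholds are assumed values chosen for illustration."""
    return (obs.normal_cos >= min_cos
            and obs.height_err <= max_height
            and obs.depth_err <= max_depth)


def weights(obs, grad_scale=0.05, l_mid=50.0, l_sigma=30.0):
    """Return (luminance weight, chromaticity weight) for one sample.
    Both shrink near depth discontinuities; the luminance weight also
    shrinks for very dark or very bright samples (assumed Gaussian form)."""
    edge = math.exp(-obs.depth_grad / grad_scale)
    lum = math.exp(-((obs.lab[0] - l_mid) ** 2) / (2.0 * l_sigma ** 2))
    return edge * lum, edge


@dataclass
class Texel:
    lab: tuple = (0.0, 0.0, 0.0)
    w_l: float = 0.0    # accumulated weight of the L channel
    w_ab: float = 0.0   # accumulated weight of the a and b channels

    def integrate(self, obs):
        """Fuse one observation; return False if it was rejected as dynamic."""
        if not is_static(obs):
            return False
        w_l, w_ab = weights(obs)
        L, a, b = self.lab
        obs_l, obs_a, obs_b = obs.lab
        # Weighted running average, guarded against a zero total weight.
        if self.w_l + w_l > 0.0:
            L = (self.w_l * L + w_l * obs_l) / (self.w_l + w_l)
        if self.w_ab + w_ab > 0.0:
            a = (self.w_ab * a + w_ab * obs_a) / (self.w_ab + w_ab)
            b = (self.w_ab * b + w_ab * obs_b) / (self.w_ab + w_ab)
        self.lab = (L, a, b)
        self.w_l += w_l
        self.w_ab += w_ab
        return True
```

With these assumed weights, a sample whose depth disagrees with the prior (for example a parked car in front of a wall) never touches the texel, and an overexposed sample pulls the fused luminance less than a plain average would while still contributing fully to chromaticity.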