🤖 AI Summary
This work addresses the challenge of efficient shortest-path planning in scenarios where real-world trajectory data are scarce, by leveraging simulators that exhibit systematic biases. The authors propose a graph Laplacian-regularized bias estimation method that integrates limited real observations, abundant synthetic data, and edge similarity structures within the road network to model smooth simulator-to-reality discrepancies. They establish theoretical guarantees on path suboptimality and devise an active learning strategy applicable even in the absence of initial real-world data. Finite-sample error analysis and experiments on road networks across multiple cities indicate that the approach closely approximates optimal paths with only a small amount of real data, while providing computable performance certificates.
📝 Abstract
Digital twins and other simulators are increasingly used to support routing decisions in large-scale networks. However, simulator outputs often exhibit systematic bias, while ground-truth measurements are costly and scarce. We study a stochastic shortest-path problem in which a planner has access to abundant synthetic samples, limited real-world observations, and an edge-similarity structure capturing expected behavioral similarity across links. We model the simulator-to-reality discrepancy as an unknown, edge-specific bias that varies smoothly over the similarity graph, and estimate it using Laplacian-regularized least squares. This approach yields calibrated edge cost estimates even in data-scarce regimes. We establish finite-sample error bounds, translate estimation error into path-level suboptimality guarantees, and propose a computable, data-driven certificate that verifies near-optimality of a candidate route. For cold-start settings without initial real data, we develop a bias-aware active learning algorithm that leverages the simulator and adaptively selects edges to measure until a prescribed accuracy is met. Numerical experiments on multiple road networks and traffic graphs further demonstrate the effectiveness of our methods.
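To make the Laplacian-regularized estimation step concrete, here is a minimal Python sketch. The abstract does not state the exact objective, so the form below is an assumption: the bias vector b minimizes the count-weighted squared residuals on measured edges plus a smoothness penalty λ·bᵀLb, where L is the Laplacian of the edge-similarity graph. The function name, argument names, and the toy data are hypothetical and not taken from the paper.

```python
import numpy as np

def laplacian_bias_estimate(sim_mean, real_mean, n_real, W, lam=1.0):
    """Estimate edge-wise simulator bias with a Laplacian smoothness penalty.

    Assumed objective (not confirmed by the source):
        sum_e n_e * (real_mean_e - sim_mean_e - b_e)^2 + lam * b^T L b
    sim_mean  : simulator mean cost per edge, shape (m,)
    real_mean : real-world mean cost per edge; entries with n_real == 0 are ignored
    n_real    : number of real observations per edge, shape (m,)
    W         : symmetric nonnegative edge-similarity weights, shape (m, m)
    """
    sim_mean = np.asarray(sim_mean, dtype=float)
    real_mean = np.asarray(real_mean, dtype=float)
    n_real = np.asarray(n_real, dtype=float)
    W = np.asarray(W, dtype=float)

    # Residual (real minus simulated) on measured edges, zero elsewhere.
    observed = n_real > 0
    resid = np.zeros_like(sim_mean)
    resid[observed] = real_mean[observed] - sim_mean[observed]

    # Combinatorial Laplacian of the similarity graph.
    L = np.diag(W.sum(axis=1)) - W
    D = np.diag(n_real)

    # First-order condition of the quadratic objective: (D + lam * L) b = D r.
    # The system is nonsingular when every connected component of the
    # similarity graph contains at least one measured edge.
    bias = np.linalg.solve(D + lam * L, D @ resid)
    return bias, sim_mean + bias

# Toy example: three edges in a similarity chain, only edge 0 is measured.
W_toy = np.array([[0.0, 1.0, 0.0],
                  [1.0, 0.0, 1.0],
                  [0.0, 1.0, 0.0]])
bias_toy, calibrated_toy = laplacian_bias_estimate(
    sim_mean=[10.0, 10.0, 10.0],
    real_mean=[12.0, np.nan, np.nan],
    n_real=[5, 0, 0],
    W=W_toy,
    lam=1.0,
)
```

In this toy case the single measured offset of +2 propagates unchanged to the two unmeasured edges, because with only one measured edge in a connected similarity graph the smoothest consistent bias is constant. With measurements on several edges, the estimate interpolates between them, and λ controls how strongly neighbouring edges are pulled together. The calibrated costs could then be passed to any standard shortest-path routine.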