Learning Shortest Paths When Data is Scarce

📅 2026-01-07
🏛️ arXiv.org
🤖 AI Summary
This work addresses efficient shortest-path planning when real-world trajectory data are scarce and the available simulators exhibit systematic bias. The authors propose a graph Laplacian-regularized bias estimation method that combines limited real observations, abundant synthetic data, and an edge-similarity structure over the road network to model simulator-to-reality discrepancies that vary smoothly across similar edges. They establish theoretical guarantees on path suboptimality and devise an active learning strategy that applies even when no real-world data are initially available. Finite-sample error analysis and experiments on road networks across multiple cities indicate that the approach closely approximates optimal paths with only a small amount of real data, while providing computable performance certificates.
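
To make the bias-estimation idea concrete, here is a minimal sketch of Laplacian-regularized least squares on a toy example. It is an illustration under stated assumptions, not the authors' implementation: the function name `estimate_bias`, the similarity matrix `W`, the regularization weight `lam`, and the toy numbers are all hypothetical. The sketch assumes the objective is the sum over real observations of (observed cost - simulator cost - bias)^2 plus `lam` times the quadratic form b^T L b, where L is the Laplacian of the edge-similarity graph.

```python
import numpy as np

def estimate_bias(sim_cost, obs_edges, obs_costs, W, lam=1.0):
    """Estimate a per-edge simulator bias (hypothetical helper).

    sim_cost  : simulator mean cost per edge, shape (m,)
    obs_edges : index of the edge measured by each real observation
    obs_costs : real-world cost recorded by each observation
    W         : symmetric edge-similarity weights, shape (m, m)
    lam       : smoothness weight (assumed tuning parameter)
    """
    sim_cost = np.asarray(sim_cost, dtype=float)
    obs_edges = np.asarray(obs_edges, dtype=int)
    m = sim_cost.size
    # Residual of each real observation against the simulator prediction.
    resid = np.asarray(obs_costs, dtype=float) - sim_cost[obs_edges]
    # Per-edge observation counts and residual sums.
    counts = np.bincount(obs_edges, minlength=m).astype(float)
    sums = np.bincount(obs_edges, weights=resid, minlength=m)
    # Graph Laplacian of the similarity structure.
    L = np.diag(W.sum(axis=1)) - W
    # Normal equations of the regularized least-squares problem.
    # Solvable when the similarity graph is connected and at least
    # one real observation exists.
    return np.linalg.solve(np.diag(counts) + lam * L, sums)

# Toy example: four edges whose similarity graph is a chain 0-1-2-3.
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
sim_cost = np.array([10.0, 10.0, 10.0, 10.0])
# Only edges 0 and 3 have a real measurement.
bias = estimate_bias(sim_cost, obs_edges=[0, 3], obs_costs=[12.0, 14.0], W=W)
calibrated_cost = sim_cost + bias
print(bias)
```

Edges 1 and 2 have no real measurements, yet they still receive a bias estimate because the Laplacian term pulls each edge toward its similar neighbors. This is the mechanism that lets a few real observations calibrate many edges.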

📝 Abstract
Digital twins and other simulators are increasingly used to support routing decisions in large-scale networks. However, simulator outputs often exhibit systematic bias, while ground-truth measurements are costly and scarce. We study a stochastic shortest-path problem in which a planner has access to abundant synthetic samples, limited real-world observations, and an edge-similarity structure capturing expected behavioral similarity across links. We model the simulator-to-reality discrepancy as an unknown, edge-specific bias that varies smoothly over the similarity graph, and estimate it using Laplacian-regularized least squares. This approach yields calibrated edge cost estimates even in data-scarce regimes. We establish finite-sample error bounds, translate estimation error into path-level suboptimality guarantees, and propose a computable, data-driven certificate that verifies near-optimality of a candidate route. For cold-start settings without initial real data, we develop a bias-aware active learning algorithm that leverages the simulator and adaptively selects edges to measure until a prescribed accuracy is met. Numerical experiments on multiple road networks and traffic graphs further demonstrate the effectiveness of our methods.
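
One simple way to turn per-edge estimation error into a checkable path-level bound is sketched below. It is a generic interval argument and may differ from the certificate constructed in the paper. It assumes each true edge cost is nonnegative and lies within `eps[e]` of the estimate `est_cost[e]`. The names `dijkstra`, `suboptimality_certificate`, `adj`, and `eps`, and the toy graph, are hypothetical. Under that assumption, the candidate route's true cost is at most its estimated cost plus its error radii, and the optimal route's true cost is at least the shortest-path length under the lower-bound costs, so the difference bounds the suboptimality.

```python
import heapq

def dijkstra(adj, cost, source, target):
    """Return (distance, edge-id list) of a shortest source-target path.

    adj maps node -> list of (neighbor, edge_id); cost is indexed by edge_id
    and must be nonnegative.
    """
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, e in adj.get(u, []):
            nd = d + cost[e]
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = (u, e)
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        raise ValueError("target is unreachable from source")
    path, node = [], target
    while node != source:
        node, e = prev[node]
        path.append(e)
    return dist[target], path[::-1]

def suboptimality_certificate(adj, est_cost, eps, source, target):
    """Candidate route plus an upper bound on its true suboptimality."""
    # Candidate: shortest path under the calibrated estimates.
    _, path = dijkstra(adj, est_cost, source, target)
    # Upper bound on the candidate's true cost.
    upper = sum(est_cost[e] + eps[e] for e in path)
    # Lower bound on the true optimal cost (clipped at zero because
    # true costs are assumed nonnegative).
    lower_cost = [max(c - r, 0.0) for c, r in zip(est_cost, eps)]
    lower, _ = dijkstra(adj, lower_cost, source, target)
    return path, upper - lower

# Toy graph: edge 0 is A->B, edge 1 is B->C, edge 2 is A->C.
adj = {"A": [("B", 0), ("C", 2)], "B": [("C", 1)]}
est_cost = [1.0, 1.0, 2.5]
eps = [0.1, 0.1, 0.2]  # assumed per-edge error radii
path, gap = suboptimality_certificate(adj, est_cost, eps, "A", "C")
print(path, gap)
```

If the returned gap falls below a prescribed tolerance, the candidate route is verified as near-optimal without knowing the true costs. If it does not, the edges with the largest `eps` on competing routes are natural candidates for further measurement.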

Problem

Research questions and friction points this paper is trying to address.

shortest path
data scarcity
simulator bias
edge cost estimation
digital twin

Innovation

Methods, ideas, or system contributions that make the work stand out.

Laplacian regularization
simulator-to-reality bias
active learning
stochastic shortest path
data-scarce routing