Learning Shortest Paths When Data is Scarce

📅 2026-01-07

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses efficient shortest-path planning when real-world trajectory data are scarce and the available simulators exhibit systematic bias. The authors propose a graph Laplacian-regularized bias estimation method that combines limited real observations, abundant synthetic data, and an edge-similarity structure on the road network, modeling the simulator-to-reality discrepancy as smooth over that structure. They establish theoretical guarantees on path suboptimality and devise an active learning strategy that applies even when no real-world data are available initially. Finite-sample error analysis and experiments on road networks across multiple cities indicate that the approach closely approximates optimal paths with only a small amount of real data, while providing computable performance certificates.
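To see why per-edge estimation error can turn into a path-level guarantee, here is a generic textbook argument. It is a sketch only and not necessarily the bound proved in the paper; the symbols $c$, $\hat{c}$, $\varepsilon_e$, $P^\star$ and $\hat{P}$ are introduced here for illustration. Assume true mean edge costs $c_e$, estimates $\hat{c}_e$ with $|\hat{c}_e - c_e| \le \varepsilon_e$, a true shortest path $P^\star$, and a path $\hat{P}$ that is shortest under $\hat{c}$.

```latex
\begin{aligned}
c(\hat{P}) - c(P^\star)
  &= \underbrace{\bigl[c(\hat{P}) - \hat{c}(\hat{P})\bigr]}_{\le \sum_{e \in \hat{P}} \varepsilon_e}
   + \underbrace{\bigl[\hat{c}(\hat{P}) - \hat{c}(P^\star)\bigr]}_{\le 0}
   + \underbrace{\bigl[\hat{c}(P^\star) - c(P^\star)\bigr]}_{\le \sum_{e \in P^\star} \varepsilon_e} \\
  &\le \sum_{e \in \hat{P}} \varepsilon_e + \sum_{e \in P^\star} \varepsilon_e .
\end{aligned}
```

The middle term is non-positive because $\hat{P}$ minimizes the estimated cost, so the suboptimality of the chosen route is controlled by the estimation error on the edges of just two paths.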

📝 Abstract

Digital twins and other simulators are increasingly used to support routing decisions in large-scale networks. However, simulator outputs often exhibit systematic bias, while ground-truth measurements are costly and scarce. We study a stochastic shortest-path problem in which a planner has access to abundant synthetic samples, limited real-world observations, and an edge-similarity structure capturing expected behavioral similarity across links. We model the simulator-to-reality discrepancy as an unknown, edge-specific bias that varies smoothly over the similarity graph, and estimate it using Laplacian-regularized least squares. This approach yields calibrated edge cost estimates even in data-scarce regimes. We establish finite-sample error bounds, translate estimation error into path-level suboptimality guarantees, and propose a computable, data-driven certificate that verifies near-optimality of a candidate route. For cold-start settings without initial real data, we develop a bias-aware active learning algorithm that leverages the simulator and adaptively selects edges to measure until a prescribed accuracy is met. Numerical experiments on multiple road networks and traffic graphs further demonstrate the effectiveness of our methods.
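The estimation step described in the abstract can be illustrated with a minimal sketch. It assumes one common form of Laplacian-regularized least squares, minimizing $\sum_i (y_i - s_{e_i} - b_{e_i})^2 + \lambda\, b^\top L b$, where $s$ is the simulator's mean edge cost, $y_i$ are real observations, and $L$ is the Laplacian of the edge-similarity graph. The paper's exact objective, weighting, and tuning may differ. The function names (`estimate_bias`, `shortest_path`), the toy network, and the value of `lam` are all hypothetical.

```python
import heapq
import numpy as np

def estimate_bias(sim_mean, obs_edges, obs_costs, W, lam=1.0):
    """Assumed form: min_b sum_i (y_i - sim_mean[e_i] - b[e_i])^2 + lam * b^T L b."""
    m = len(sim_mean)
    L = np.diag(W.sum(axis=1)) - W          # Laplacian of the edge-similarity graph
    counts = np.zeros(m)                    # number of real observations per edge
    resid = np.zeros(m)                     # summed (real - simulated) residuals per edge
    for e, y in zip(obs_edges, obs_costs):
        counts[e] += 1.0
        resid[e] += y - sim_mean[e]
    # Normal equations: (diag(counts) + lam * L) b = resid.
    # Solvable when every connected component of W has at least one observation.
    return np.linalg.solve(np.diag(counts) + lam * L, resid)

def shortest_path(n_nodes, edges, cost, src, dst):
    """Dijkstra on a directed graph; assumes non-negative costs and a reachable dst."""
    adj = [[] for _ in range(n_nodes)]
    for idx, (u, v) in enumerate(edges):
        adj[u].append((v, idx))
    dist = [float("inf")] * n_nodes
    prev = [None] * n_nodes
    dist[src] = 0.0
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue                        # stale heap entry
        for v, idx in adj[u]:
            nd = d + cost[idx]
            if nd < dist[v]:
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    path, node = [], dst
    while node is not None:
        path.append(node)
        node = prev[node]
    return path[::-1], dist[dst]

# Toy network (hypothetical): two routes from node 0 to node 3.
edges = [(0, 1), (1, 3), (0, 2), (2, 3)]
sim_mean = np.array([1.0, 1.0, 1.5, 1.5])   # simulator favors route 0 -> 1 -> 3

# Edge similarity: edges 0 and 1 behave alike, as do edges 2 and 3.
W = np.zeros((4, 4))
W[0, 1] = W[1, 0] = 1.0
W[2, 3] = W[3, 2] = 1.0

# Only two real measurements: edge 0 is much slower than simulated, edge 2 matches.
obs_edges, obs_costs = [0, 2], [3.0, 1.5]

b_hat = estimate_bias(sim_mean, obs_edges, obs_costs, W, lam=1.0)
calibrated = np.maximum(sim_mean + b_hat, 0.0)   # clip so Dijkstra stays valid

path_sim, cost_sim = shortest_path(4, edges, sim_mean, 0, 3)
path_cal, cost_cal = shortest_path(4, edges, calibrated, 0, 3)
print("estimated bias:", b_hat)
print("simulator-only route:", path_sim, "calibrated route:", path_cal)
```

In this toy case the single measurement on edge 0 is propagated to the unmeasured but similar edge 1 through the Laplacian term, which is enough to change the selected route. Larger `lam` enforces smoother bias across similar edges, while smaller `lam` trusts per-edge residuals more.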

Problem

Research questions and friction points this paper is trying to address.

shortest path

data scarcity

simulator bias

edge cost estimation

digital twin

Innovation

Methods, ideas, or system contributions that make the work stand out.

Laplacian regularization

simulator-to-reality bias

active learning