An Empirical Survey and Benchmark of Learned Distance Indexes for Road Networks

📅 2026-02-03

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing shortest-path distance computation methods suffer from excessive latency in large-scale, real-time road networks, while machine learning–based distance indexing approaches lack systematic evaluation. This work presents the first comprehensive empirical study of ten representative learning-based distance indexing methods across seven real-world road networks and trajectory-driven query sets. We conduct a thorough comparison against classical non-ML baselines along four key dimensions: training time, query latency, storage overhead, and accuracy. To facilitate reproducibility and future research, we develop and publicly release a unified evaluation framework. Our analysis reveals critical trade-offs between efficiency and accuracy across different methods and identifies their respective suitability for various application scenarios, thereby establishing a reproducible benchmark and offering practical guidance for both academic research and industrial deployment.

📝 Abstract

Computing shortest-path distances in road networks is a core operation in navigation systems, location-based services, and spatial analytics. Although classical algorithms such as Dijkstra's algorithm provide exact answers, their latency is prohibitive for modern real-time, large-scale deployments. Over the past two decades, numerous distance indexes have been proposed to speed up shortest-distance query processing. More recently, with advances in machine learning (ML), researchers have proposed ML-based distance indexes that answer approximate shortest-path and distance queries efficiently. However, a comprehensive and systematic evaluation of these ML-based approaches is lacking. This paper presents the first empirical survey of ML-based distance indexes on road networks, evaluating them along four key dimensions: training time, query latency, storage, and accuracy. Using seven real-world road networks and workload-driven query datasets derived from trajectory data, we benchmark ten representative ML techniques and compare them against strong classical non-ML baselines, highlighting key insights and practical trade-offs. We release a unified open-source codebase to support reproducibility and future research on learned distance indexes.
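
To make two of the four evaluation dimensions concrete, the sketch below measures accuracy (as mean relative error) and per-query latency for an approximate distance estimator, using Dijkstra's algorithm as the exact reference. It is only an illustration and is not taken from the paper's codebase: the toy graph, the names `evaluate` and `toy_estimator`, and the choice of mean relative error as the accuracy metric are all assumptions. Training time and storage are left out for brevity.

```python
import heapq
import time


def dijkstra(graph, source, target):
    """Exact shortest-path distance on a weighted adjacency dict."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            return d
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry, a shorter path was already found
        for v, w in graph[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return float("inf")


def evaluate(estimator, graph, queries):
    """Return (mean relative error, mean latency in seconds per query)."""
    start = time.perf_counter()
    estimates = [estimator(s, t) for s, t in queries]
    elapsed = time.perf_counter() - start

    errors = []
    for (s, t), est in zip(queries, estimates):
        exact = dijkstra(graph, s, t)  # ground truth, computed outside the timed region
        errors.append(abs(est - exact) / exact)
    return sum(errors) / len(errors), elapsed / len(queries)


# Toy road network (assumption): node -> list of (neighbor, edge length).
toy_graph = {
    "a": [("b", 2.0), ("c", 5.0)],
    "b": [("a", 2.0), ("c", 1.0), ("d", 4.0)],
    "c": [("a", 5.0), ("b", 1.0), ("d", 1.0)],
    "d": [("b", 4.0), ("c", 1.0)],
}
toy_queries = [("a", "d"), ("a", "c"), ("b", "d")]


def toy_estimator(s, t):
    """Hypothetical stand-in for a learned index: overestimates by 10%."""
    return 1.1 * dijkstra(toy_graph, s, t)


mre, latency = evaluate(toy_estimator, toy_graph, toy_queries)
print(f"mean relative error: {mre:.3f}, mean latency: {latency:.2e} s")
```

In a real benchmark the estimator would be a trained model or a precomputed index, and the queries would come from a trajectory-derived workload rather than a hand-written list.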

Problem

Research questions and friction points this paper is trying to address.

learned distance indexes

road networks

shortest-path distance

empirical evaluation

machine learning

Innovation

Methods, ideas, or system contributions that make the work stand out.

learned index

distance estimation

road networks