Local Distance-Preserving Node Embeddings and Their Performance on Random Graphs

📅 2025-04-11
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This paper addresses the challenge of jointly preserving local similarity and modeling global distances in graph node embedding. We propose a landmark-based shortest-path approximation method that locally preserves graph distances in low-dimensional space. Theoretically, we prove that for random graphs—including Erdős-Rényi graphs—the required embedding dimension for landmark-based representations is significantly lower than the worst-case bound dictated by Bourgain-type metric embedding theory. Empirically, graph neural networks (GNNs) efficiently learn and generalize pairwise distances between landmarks, achieving high accuracy and strong scalability on large-scale graphs. Our key contributions are twofold: (1) establishing, for the first time, a theoretical connection between random graph structure and optimal embedding dimensionality; and (2) empirically validating the strong generalization capability of GNNs in learning landmark distances. This work introduces a novel, lightweight, and scalable paradigm for graph distance modeling.

📝 Abstract
Learning node representations is a fundamental problem in graph machine learning. While existing embedding methods effectively preserve local similarity measures, they often fail to capture global functions like graph distances. Inspired by Bourgain's seminal work on Hilbert space embeddings of metric spaces (1985), we study the performance of local distance-preserving node embeddings. Known as landmark-based algorithms, these embeddings approximate pairwise distances by computing shortest paths from a small subset of reference nodes (i.e., landmarks). Our main theoretical contribution shows that random graphs, such as Erdős–Rényi random graphs, require lower dimensions in landmark-based embeddings compared to worst-case graphs. Empirically, we demonstrate that the GNN-based approximations for the distances to landmarks generalize well to larger networks, offering a scalable alternative for graph representation learning.
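The landmark idea in the abstract can be illustrated with a minimal sketch (not code from the paper): embed each node as its vector of BFS distances to a few landmark nodes, then bound any pairwise distance from those vectors via the triangle inequality. Function and variable names here are illustrative assumptions.

```python
from collections import deque

def bfs_distances(adj, source):
    """Unweighted shortest-path distances from `source` via BFS."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def landmark_embedding(adj, landmarks):
    """Embed each node as its vector of distances to the landmarks.

    The embedding dimension equals the number of landmarks; the paper's
    theoretical result concerns how few landmarks suffice on random graphs.
    """
    per_landmark = [bfs_distances(adj, l) for l in landmarks]
    return {v: [d[v] for d in per_landmark] for v in adj}

def approx_distance(emb_u, emb_v):
    """Triangle-inequality bounds from landmark distances:
    max_l |d(l,u) - d(l,v)|  <=  d(u,v)  <=  min_l d(l,u) + d(l,v)."""
    lower = max(abs(a - b) for a, b in zip(emb_u, emb_v))
    upper = min(a + b for a, b in zip(emb_u, emb_v))
    return lower, upper

# Toy example: a path graph 0-1-2-3-4 with landmarks at both ends.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
emb = landmark_embedding(adj, landmarks=[0, 4])
lo, hi = approx_distance(emb[1], emb[3])  # true distance d(1,3) = 2
```

On this path graph the lower bound is tight (lo == 2), while the upper bound routes through a landmark (hi == 4); landmark placement determines which pairs are approximated well.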
Problem

Research questions and friction points this paper is trying to address.

Study local distance-preserving node embeddings for graphs
Compare landmark-based embeddings on random vs worst-case graphs
Evaluate GNN-based distance approximations for scalability
Innovation

Methods, ideas, or system contributions that make the work stand out.

Landmark-based node embeddings preserve distances
Lower dimensions needed for random graphs
GNN approximates distances for scalability
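Since the dimension result above is stated for Erdős–Rényi graphs, a quick G(n, p) sampler is handy for experimenting with landmark embeddings on random graphs. This is a standard construction, not code from the paper; names are illustrative.

```python
import random

def erdos_renyi(n, p, seed=None):
    """Sample an Erdős–Rényi graph G(n, p): each of the n*(n-1)/2
    possible undirected edges is included independently with probability p."""
    rng = random.Random(seed)
    adj = {v: [] for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].append(v)
                adj[v].append(u)
    return adj

# With p above the connectivity threshold (~ ln(n)/n), the sampled
# graph is connected with high probability, so BFS from a handful of
# random landmarks reaches every node.
graph = erdos_renyi(n=100, p=0.1, seed=0)
```

For larger experiments, `networkx.gnp_random_graph` provides the same model with a fast sparse sampler.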