GRAIL: Graph Edit Distance and Node Alignment Using LLM-Generated Code

📅 2025-05-04
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper targets three challenges in graph edit distance (GED) computation: scarcity of ground-truth annotations, poor interpretability of neural approaches, and weak cross-domain generalization. It proposes GRAIL, the first LLM-driven program synthesis framework for GED solving. Rather than fitting a neural model end to end, GRAIL uses automated prompt tuning and code generation to produce executable programs that compute GED and output node alignments. This allows learning without ground-truth supervision, makes the computation interpretable end to end, and lets the generated programs be applied to new domains without retraining. Methodologically, it integrates LLM-based reasoning with classical graph matching and edit path search algorithms. Evaluated on seven benchmark datasets, GRAIL surpasses state-of-the-art approximate methods in approximation quality and generalizes across diverse graph distributions.

📝 Abstract
Graph Edit Distance (GED) is a widely used metric for measuring similarity between two graphs. Computing the optimal GED is NP-hard, leading to the development of various neural and non-neural heuristics. While neural methods have achieved improved approximation quality compared to non-neural approaches, they face significant challenges: (1) They require large amounts of ground truth data, which is itself NP-hard to compute. (2) They operate as black boxes, offering limited interpretability. (3) They lack cross-domain generalization, necessitating expensive retraining for each new dataset. We address these limitations with GRAIL, introducing a paradigm shift in this domain. Instead of training a neural model to predict GED, GRAIL employs a novel combination of large language models (LLMs) and automated prompt tuning to generate a program that is used to compute GED. This shift from predicting GED to generating programs imparts various advantages, including end-to-end interpretability and an autonomous self-evolutionary learning mechanism without ground-truth supervision. Extensive experiments on seven datasets confirm that GRAIL not only surpasses state-of-the-art GED approximation methods in prediction quality but also achieves robust cross-domain generalization across diverse graph distributions.
Problem

Research questions and friction points this paper is trying to address.

Computing optimal Graph Edit Distance is NP-hard
Neural methods lack interpretability and cross-domain generalization
Neural methods need large amounts of ground-truth GED data, which is itself NP-hard to compute
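
To make the NP-hardness point concrete, here is a minimal sketch of exact GED by brute force. It is an illustration only, not code from the paper, and it assumes unit edit costs, unlabeled nodes, and simple undirected graphs. The names `alignment_cost` and `exact_ged` are hypothetical. The smaller graph is padded with isolated dummy nodes so that every node alignment is a permutation; the search then tries all n! permutations, which is why this approach only works for very small graphs.

```python
from itertools import permutations


def alignment_cost(n1, edges1, n2, edges2, perm):
    """Edit cost induced by a node alignment (assumed: unit costs, unlabeled nodes).

    Both graphs are treated as having n = max(n1, n2) nodes, the extra ones being
    isolated dummies. perm[u] is the node of graph 2 aligned with node u of graph 1.
    """
    # Each dummy node stands for one node insertion or deletion.
    node_cost = abs(n1 - n2)
    # Map every edge of graph 1 through the alignment, then count mismatched edges.
    mapped = {frozenset((perm[u], perm[v])) for u, v in edges1}
    target = {frozenset(e) for e in edges2}
    return node_cost + len(mapped ^ target)


def exact_ged(n1, edges1, n2, edges2):
    """Return (GED, best alignment) by enumerating all n! alignments."""
    n = max(n1, n2)
    best_cost, best_perm = None, None
    for perm in permutations(range(n)):
        cost = alignment_cost(n1, edges1, n2, edges2, perm)
        if best_cost is None or cost < best_cost:
            best_cost, best_perm = cost, perm
    return best_cost, best_perm


# Triangle versus 3-node path: one edge deletion is enough.
triangle = [(0, 1), (1, 2), (0, 2)]
path3 = [(0, 1), (1, 2)]
ged, alignment = exact_ged(3, triangle, 3, path3)  # ged == 1
```

The returned alignment is what makes the result inspectable: each matched pair and each mismatched edge corresponds to a specific edit operation.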
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses LLMs to generate GED computation code
Automated prompt tuning for program generation
Self-evolutionary learning without ground-truth data
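
The summary does not say how candidate programs are compared without ground-truth labels. One label-free signal, assumed here purely for illustration, is that any node alignment induces a valid edit path, so its cost is an upper bound on the true GED and a lower bound-from-above is better. The sketch below scores two hand-written stand-ins for generated programs on that basis. The names `identity_heuristic`, `degree_heuristic`, and `select_program` are hypothetical, and unit costs with unlabeled nodes are assumed.

```python
def alignment_cost(n1, edges1, n2, edges2, perm):
    """Cost of the edit path induced by perm; an upper bound on GED.

    Assumed: unit costs, unlabeled nodes, graphs padded to max(n1, n2) nodes.
    """
    mapped = {frozenset((perm[u], perm[v])) for u, v in edges1}
    target = {frozenset(e) for e in edges2}
    return abs(n1 - n2) + len(mapped ^ target)


def identity_heuristic(n1, edges1, n2, edges2):
    """Stand-in candidate program: align node i with node i."""
    return list(range(max(n1, n2)))


def degree_heuristic(n1, edges1, n2, edges2):
    """Stand-in candidate program: align nodes by descending degree rank."""
    n = max(n1, n2)

    def ranked(edges):
        deg = [0] * n
        for u, v in edges:
            deg[u] += 1
            deg[v] += 1
        # Highest degree first; ties broken by node index.
        return sorted(range(n), key=lambda x: (-deg[x], x))

    perm = [0] * n
    for a, b in zip(ranked(edges1), ranked(edges2)):
        perm[a] = b
    return perm


def select_program(candidates, pairs):
    """Pick the candidate with the lowest total upper bound; no labels needed."""
    def total(program):
        return sum(alignment_cost(*pair, program(*pair)) for pair in pairs)

    return min(candidates, key=total)


# One unlabeled pair: a star centered at node 0 and a star centered at node 2.
pairs = [(4, [(0, 1), (0, 2), (0, 3)], 4, [(2, 0), (2, 1), (2, 3)])]
best = select_program([identity_heuristic, degree_heuristic], pairs)
```

On this pair the degree-based candidate finds a zero-cost alignment while the identity alignment does not, so it is selected. In a real pipeline the candidates would be LLM-generated programs rather than these two fixed functions.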
Samidha Verma
Yardi School of Artificial Intelligence, IIT Delhi, India
Arushi Goyal
Department of Computer Science and Engineering, IIT Delhi, India
Ananya Mathur
Department of Computer Science and Engineering, IIT Delhi, India
Ankit Anand
Research Scientist, Google DeepMind
Artificial Intelligence, Machine Learning, Algorithms
Sayan Ranu
IIT Delhi
Machine learning for graphs