🤖 AI Summary
Property graphs are widely deployed in domains such as healthcare and finance, yet frequently suffer from data inconsistencies, missing values, and schema violations. Traditional rule-based repair techniques lack generalizability, while interactive human-in-the-loop approaches scale poorly. This paper presents the first systematic evaluation of six open-source large language models (LLMs) for automated property graph repair. We propose a repair paradigm that integrates contextual reasoning with external knowledge, realized through prompt engineering and an empirical evaluation framework that leverages LLMs' internalized knowledge for end-to-end error detection and correction. Experimental results show that LLMs hold substantial promise for graph repair, with clear trade-offs among accuracy, efficiency, and scalability across models. Our work delivers a reproducible, fully automated approach to improving graph data quality and clarifies where LLMs are applicable, and how they can be improved, for structured data repair.
📝 Abstract
Property graphs are widely used in domains such as healthcare, finance, and social networks, but they often contain errors due to inconsistencies, missing data, or schema violations. Traditional rule-based and heuristic-driven graph repair methods are limited in adaptability, since they must be tailored to each dataset. Interactive human-in-the-loop approaches, on the other hand, can become infeasible for large graphs, as the cost in time and effort of involving users grows too high. Recent advances in Large Language Models (LLMs) open new opportunities for automated graph repair that leverages contextual reasoning and the models' access to real-world knowledge. We evaluate the effectiveness of six open-source LLMs in repairing property graphs, assessing repair quality, computational cost, and model-specific performance. Our experiments show that LLMs can detect and correct errors, with varying degrees of accuracy and efficiency. We discuss the strengths, limitations, and challenges of LLM-driven graph repair and outline future research directions for improving scalability and interpretability.
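The end-to-end repair loop described in the abstract (prompt an LLM with a node and its expected schema, then parse the corrected node from the response) can be sketched roughly as below. This is a minimal illustration, not the paper's actual pipeline: the prompt template, the `call_llm` stub, and the example node are all assumptions made for demonstration; a real system would call one of the evaluated open-source models.

```python
import json

def build_repair_prompt(node: dict, schema: dict) -> str:
    """Assemble a prompt asking an LLM to detect and fix property errors
    in a single graph node against an expected schema (illustrative template)."""
    return (
        "You are a data-repair assistant for property graphs.\n"
        f"Expected schema: {json.dumps(schema)}\n"
        f"Node: {json.dumps(node)}\n"
        "Return the corrected node as JSON only."
    )

def call_llm(prompt: str) -> str:
    # Stand-in for a real model call (e.g. a locally hosted open-source LLM).
    # Here it returns a canned repair so the sketch is runnable end to end.
    return '{"name": "Alice", "age": 34}'

def repair_node(node: dict, schema: dict) -> dict:
    """End-to-end repair: prompt the model, parse its JSON answer into a node."""
    raw = call_llm(build_repair_prompt(node, schema))
    return json.loads(raw)

# A node whose "age" value violates the schema's integer type.
broken = {"name": "Alice", "age": "thirty-four"}
schema = {"name": "string", "age": "integer"}
print(repair_node(broken, schema))  # corrected node as a Python dict
```

In practice the hard problems the paper studies sit inside `call_llm`: whether the model reliably detects the violation, whether its output parses as valid JSON, and how repair quality and cost vary across the six evaluated models.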