LLMTM: Benchmarking and Optimizing LLMs for Temporal Motif Analysis in Dynamic Graphs

📅 2025-12-24
📈 Citations: 0
Influential: 0
📄 PDF

career value

209K/year
🤖 AI Summary
Prior work has not systematically investigated large language models’ (LLMs) capabilities in temporal motif analysis on dynamic graphs, nor established dedicated benchmarks or optimization methods for this task. Method: We introduce LLMTM—the first LLM-specific benchmark for dynamic graph temporal motif analysis—comprising nine motif types and six analytical tasks. We further propose a tool-augmented LLM agent to enhance analytical accuracy and a structure-aware scheduler to reduce inference overhead while preserving high accuracy. Contribution/Results: Extensive evaluation of nine state-of-the-art LLMs (e.g., Qwen, DeepSeek, GPT-4o-mini) on LLMTM demonstrates that our agent achieves SOTA accuracy; the scheduler reduces average inference cost by 42.3% with negligible accuracy degradation (<1.2%). This work establishes the first LLM evaluation paradigm for dynamic graph temporal motif analysis and presents a performance-efficiency co-optimization framework.

Technology Category

Application Category

📝 Abstract
The widespread application of Large Language Models (LLMs) has motivated a growing interest in their capacity for processing dynamic graphs. Temporal motifs, as an elementary unit and important local property of dynamic graphs which can directly reflect anomalies and unique phenomena, are essential for understanding their evolutionary dynamics and structural features. However, leveraging LLMs for temporal motif analysis on dynamic graphs remains relatively unexplored. In this paper, we systematically study LLM performance on temporal motif-related tasks. Specifically, we propose a comprehensive benchmark, LLMTM (Large Language Models in Temporal Motifs), which includes six tailored tasks across nine temporal motif types. We then conduct extensive experiments to analyze the impacts of different prompting techniques and LLMs (including nine models: openPangu-7B, the DeepSeek-R1-Distill-Qwen series, Qwen2.5-32B-Instruct, GPT-4o-mini, DeepSeek-R1, and o3) on model performance. Informed by our benchmark findings, we develop a tool-augmented LLM agent that leverages precisely engineered prompts to solve these tasks with high accuracy. Nevertheless, the high accuracy of the agent incurs a substantial cost. To address this trade-off, we propose a simple yet effective structure-aware dispatcher that considers both the dynamic graph's structural properties and the LLM's cognitive load to intelligently dispatch queries between the standard LLM prompting and the more powerful agent. Our experiments demonstrate that the structure-aware dispatcher effectively maintains high accuracy while reducing cost.
Problem

Research questions and friction points this paper is trying to address.

Benchmarking LLMs for temporal motif analysis in dynamic graphs
Developing a tool-augmented LLM agent for accurate motif tasks
Proposing a cost-effective dispatcher to balance accuracy and expense
Innovation

Methods, ideas, or system contributions that make the work stand out.

Benchmarking LLMs for temporal motif tasks
Developing tool-augmented LLM agent for accuracy
Using structure-aware dispatcher to balance cost and performance