TAAF: A Trace Abstraction and Analysis Framework Synergizing Knowledge Graphs and LLMs

2026-01-06
arXiv.org
Citations: 1
Influential: 0
AI Summary
This work addresses the challenge of analyzing massive execution traces generated by large-scale systems such as operating system kernels, Chrome, and MySQL, which are difficult to interpret using existing tools that rely on predefined methods or error-prone, labor-intensive domain-specific scripts. The paper proposes TAAF, a novel framework that integrates temporal-indexed knowledge graphs with large language models (LLMs) to enable multi-hop and causal reasoning through a natural language question-answering interface, substantially reducing reliance on manual expertise. Evaluated on the authors' newly introduced TraceQA-100 benchmark, TAAF achieves up to a 31.2% improvement in answer accuracy over baseline methods, demonstrating particularly strong performance on complex reasoning tasks.

πŸ“ Abstract
Execution traces are a critical source of information for understanding, debugging, and optimizing complex software systems. However, traces from OS kernels or large-scale applications like Chrome or MySQL are massive and difficult to analyze. Existing tools rely on predefined analyses, and custom insights often require writing domain-specific scripts, which is an error-prone and time-consuming task. This paper introduces TAAF (Trace Abstraction and Analysis Framework), a novel approach that combines time-indexing, knowledge graphs (KGs), and large language models (LLMs) to transform raw trace data into actionable insights. TAAF constructs a time-indexed KG from trace events to capture relationships among entities such as threads, CPUs, and system resources. An LLM then interprets query-specific subgraphs to answer natural-language questions, reducing the need for manual inspection and deep system expertise. To evaluate TAAF, we introduce TraceQA-100, a benchmark of 100 questions grounded in real kernel traces. Experiments across three LLMs and multiple temporal settings show that TAAF improves answer accuracy by up to 31.2%, particularly in multi-hop and causal reasoning tasks. We further analyze where graph-grounded reasoning helps and where limitations remain, offering a foundation for next-generation trace analysis tools.
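The pipeline described in the abstract (index trace events by time, link entities in a graph, then hand a query-specific subgraph to an LLM) can be illustrated with a minimal Python sketch. Everything below is a hypothetical illustration and not TAAF's actual implementation: the `TraceEvent` fields, the relation names such as `runs_on`, the `TimeIndexedGraph` class, and the prompt layout in `to_prompt` are assumptions made for this example, and the LLM call itself is left out.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceEvent:
    """One simplified trace record (hypothetical schema, not TAAF's)."""
    timestamp: int  # assumed to be an integer tick, e.g. nanoseconds
    thread: str
    cpu: str
    resource: str
    action: str     # relation between thread and resource, e.g. "acquires"


class TimeIndexedGraph:
    """Stores events sorted by time so a window can be found by binary search."""

    def __init__(self, events):
        self.events = sorted(events, key=lambda e: e.timestamp)
        self.timestamps = [e.timestamp for e in self.events]

    def subgraph(self, start, end):
        """Return (subject, relation, object, timestamp) edges with start <= t <= end."""
        lo = bisect_left(self.timestamps, start)
        hi = bisect_right(self.timestamps, end)
        edges = []
        for e in self.events[lo:hi]:
            # Each event contributes two edges: thread-to-CPU and thread-to-resource.
            edges.append((e.thread, "runs_on", e.cpu, e.timestamp))
            edges.append((e.thread, e.action, e.resource, e.timestamp))
        return edges


def to_prompt(edges, question):
    """Serialize a subgraph into text that could be sent to an LLM (assumed format)."""
    facts = "\n".join(f"[t={t}] {s} --{r}--> {o}" for s, r, o, t in edges)
    return f"Trace facts:\n{facts}\n\nQuestion: {question}"


# Made-up events for illustration only; they do not come from a real kernel trace.
events = [
    TraceEvent(900, "thread-1", "cpu-0", "mutex-A", "releases"),
    TraceEvent(100, "thread-1", "cpu-0", "mutex-A", "acquires"),
    TraceEvent(250, "thread-2", "cpu-1", "mutex-A", "waits_on"),
]
graph = TimeIndexedGraph(events)
window = graph.subgraph(0, 300)
prompt = to_prompt(window, "Why is thread-2 blocked?")
print(prompt)
```

Restricting the prompt to the edges inside a time window is one plausible reading of "query-specific subgraphs"; a real system would likely also filter by the entities named in the question.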
Problem

Research questions and friction points this paper is trying to address.

execution traces
trace analysis
large-scale software systems
debugging
optimization
Innovation

Methods, ideas, or system contributions that make the work stand out.

trace analysis
knowledge graph
large language models
time-indexed graph
natural language querying
Alireza Ezaz
Brock University, Computer Science, St. Catharines, Ontario, Canada
Ghazal Khodabandeh
Brock University, Computer Science, St. Catharines, Ontario, Canada
Majid Babaei
McGill University, Computer Science, Montreal, Quebec, Canada
Naser Ezzati-Jivan
Associate Professor at Brock University
Software Engineering, Software Analysis, Performance Evaluation