Large Language Models as Realistic Microservice Trace Generators

📅 2024-12-16

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Real-world microservice call-graph traces are scarce, which hinders system behavior analysis and resource management. Method: This paper pioneers the use of large language models (LLMs) for synthetic workload trace generation, proposing a two-stage paradigm: (1) recursive sequential generation to explicitly model the hierarchical structure and dynamic evolution of call graphs; and (2) instruction tuning guided by implicit structural constraints to ensure topological consistency and coverage of rare scenarios. The approach integrates graph-structured modeling, LLM fine-tuning, and a synthetic-data-driven evaluation framework. Contribution/Results: Experiments demonstrate that the generated traces significantly outperform state-of-the-art methods in diversity, topological fidelity, and downstream utility, including feature prediction and missing-trace completion. The synthetic traces effectively substitute for real traces in microservice resource optimization and system-level analysis.

📝 Abstract

Workload traces are essential to understand complex computer systems' behavior and manage processing and memory resources. Since real-world traces are hard to obtain, synthetic trace generation is a promising alternative. This paper proposes a first-of-a-kind approach that relies on training a large language model (LLM) to generate synthetic workload traces, specifically microservice call graphs. To capture complex and arbitrary hierarchical structures and implicit constraints in such traces, we show how to fine-tune LLMs to generate recursively, making call graph generation a sequence of easier steps. To further enforce learning constraints in traces and generate uncommon situations, we argue for applying additional instruction tuning steps to align our model with the desired trace features. Our evaluation results show that we can generate diverse realistic traces under various conditions and outperform existing methods in accuracy and validity. We demonstrate that our synthetically generated traces can effectively replace real data to optimize important microservice management tasks. Additionally, our model adapts to downstream trace-related tasks, such as predicting key trace features and infilling missing data.
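The idea of generating a call graph recursively, as a sequence of easier steps, can be illustrated with a minimal sketch. This is not the paper's implementation: the `Call` type, the `generate` function, and the `toy_layer` lookup table are hypothetical names introduced here, and `toy_layer` merely stands in for a single generation step of a fine-tuned LLM. Each step only has to produce the direct children of one service, and the full graph emerges from repeating that step.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Call:
    """One node of a call graph: a service invoked at a given depth."""
    service: str
    depth: int
    children: List["Call"] = field(default_factory=list)


# Hypothetical stand-in for one model generation step: given a parent
# service, return the names of the services it calls directly.
LayerFn = Callable[[str], List[str]]


def toy_layer(service: str) -> List[str]:
    # Assumption: a fixed lookup table replaces the fine-tuned LLM.
    fanout = {"frontend": ["auth", "cart"], "cart": ["db"]}
    return fanout.get(service, [])


def generate(service: str, next_layer: LayerFn, max_depth: int, depth: int = 0) -> Call:
    """Build the call graph one layer at a time, recursing into each child."""
    node = Call(service, depth)
    if depth < max_depth:  # depth limit acts as a simple structural constraint
        for child in next_layer(service):
            node.children.append(generate(child, next_layer, max_depth, depth + 1))
    return node


def count_edges(node: Call) -> int:
    """Number of caller-to-callee edges in the subtree rooted at node."""
    return sum(1 + count_edges(child) for child in node.children)


trace = generate("frontend", toy_layer, max_depth=4)
shallow = generate("frontend", toy_layer, max_depth=1)
```

In this toy setup, the depth limit plays the role of a user-specified condition on the trace; a real system would pass such conditions to the model at every step rather than enforce them in the driver loop.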

Problem

Research questions and friction points this paper is trying to address.

Generating realistic synthetic microservice workload traces.

Fine-tuning LLMs for recursive call graph generation.

Enhancing trace generation accuracy and validity.

Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM for synthetic trace generation

Fine-tuned for recursive call graphs

Instruction tuning for trace constraints

🔎 Similar Papers

BurstGPT: A Real-world Workload Dataset to Optimize LLM Serving Systems