Morphis: SLO-Aware Resource Scheduling for Microservices with Time-Varying Call Graphs

📅 2026-02-01
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing microservice resource scheduling approaches struggle to adapt to the dynamic evolution of runtime call graphs, often resulting in either resource wastage or violations of service-level objectives (SLOs). This work reveals, for the first time, the high concentration of invocation paths in large-scale production environments and introduces a joint optimization framework based on structural fingerprint decomposition and pattern-aware modeling. By identifying stable backbones and deviating subgraphs within call graphs through structural fingerprints, and integrating invocation pattern clustering to predict workload distributions, the proposed method constructs a global resource allocation model that satisfies end-to-end tail-latency SLOs. Evaluated on the TrainTicket benchmark, the approach reduces CPU consumption by 35–38% compared to state-of-the-art baselines while maintaining a 98.8% SLO compliance rate.

📝 Abstract
Modern microservice systems exhibit continuous structural evolution in their runtime call graphs due to workload fluctuations, fault responses, and deployment activities. Despite this complexity, our analysis of over 500,000 production traces from ByteDance reveals a latent regularity: execution paths concentrate around a small set of recurring invocation patterns. However, existing resource management approaches fail to exploit this structure. Industrial autoscalers like Kubernetes HPA ignore inter-service dependencies, while recent academic methods often assume static topologies, rendering them ineffective under dynamic execution contexts. In this work, we propose Morphis, a dependency-aware provisioning framework that unifies pattern-aware trace analysis with global optimization. It introduces structural fingerprinting that decomposes traces into a stable execution backbone and interpretable deviation subgraphs. Then, resource allocation is formulated as a constrained optimization problem over predicted pattern distributions, jointly minimizing aggregate CPU usage while satisfying end-to-end tail-latency SLOs. Our extensive evaluations on the TrainTicket benchmark demonstrate that Morphis reduces CPU consumption by 35-38% compared to state-of-the-art baselines while maintaining 98.8% SLO compliance.
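The decomposition idea in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a trace is simply a list of caller-to-callee edges, treats the edge set as the fingerprint, and calls an edge part of the backbone when it appears in at least a chosen fraction of traces. The service names, the `min_support` threshold, and the function names are all hypothetical.

```python
from collections import Counter

Edge = tuple[str, str]  # (caller, callee); an assumed trace representation


def fingerprint(trace: list[Edge]) -> frozenset[Edge]:
    """Order-insensitive fingerprint: the set of call edges in one trace."""
    return frozenset(trace)


def backbone(traces: list[list[Edge]], min_support: float = 0.8) -> frozenset[Edge]:
    """Edges that occur in at least `min_support` of all traces (assumed rule)."""
    counts = Counter(edge for t in traces for edge in fingerprint(t))
    cutoff = min_support * len(traces)
    return frozenset(e for e, c in counts.items() if c >= cutoff)


def decompose(trace: list[Edge], stable: frozenset[Edge]):
    """Split one trace into its backbone part and its deviation subgraph."""
    fp = fingerprint(trace)
    return fp & stable, fp - stable


# Hypothetical traces: most requests follow the same path, a few deviate.
traces = [
    [("gateway", "order"), ("order", "payment")],
    [("gateway", "order"), ("order", "payment")],
    [("gateway", "order"), ("order", "payment"), ("payment", "fraud-check")],
    [("gateway", "order"), ("order", "payment")],
    [("gateway", "order"), ("order", "inventory")],
]

stable = backbone(traces)
core, deviation = decompose(traces[2], stable)

# Counting identical fingerprints shows how concentrated the patterns are.
patterns = Counter(fingerprint(t) for t in traces)
```

In this toy input, three of the five traces share one fingerprint, which mirrors the concentration the abstract reports for production traces. A resource model would then work from the predicted mix of such patterns rather than from a single static topology.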
Problem

Research questions and friction points this paper is trying to address.

microservices
resource scheduling
dynamic call graphs
SLO
service dependencies
Innovation

Methods, ideas, or system contributions that make the work stand out.

structural fingerprinting
pattern-aware scheduling
microservice SLO optimization
dynamic call graph
dependency-aware provisioning
Yu Tang
School of Software Technology, Zhejiang University, Ningbo 315100, China; Zhejiang Key Laboratory of Digital-Intelligence Service Technology, Hangzhou 310053, China
Hailiang Zhao
ZJU 100 Young Professor, Zhejiang University
Service Computing, Edge Computing, Learning-Augmented Algorithms
Rui Shi
ByteDance, Inc.
Database Systems, Big Data, Distributed Systems, Cloud Native, Programming Languages
Chuansheng Lu
ByteDance Ltd., Shanghai 800082, China
Yifei Zhang
ByteDance Ltd., Shanghai 800082, China
Kingsum Chow
School of Software Technology, Zhejiang University, Ningbo 315100, China; Zhejiang Key Laboratory of Digital-Intelligence Service Technology, Hangzhou 310053, China
Shuiguang Deng
College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China; Zhejiang Key Laboratory of Digital-Intelligence Service Technology