🤖 AI Summary
This work addresses the challenge of efficiently orchestrating AI services across the cloud-edge continuum under dynamic workloads and heterogeneous infrastructure, where minimizing latency, maximizing throughput, and optimizing resource utilization are often conflicting objectives. To tackle this, the authors propose AIF-Router, a novel framework that introduces active inference into edge AI service orchestration for the first time. By leveraging Bayesian state inference and minimizing expected free energy, AIF-Router enables unsupervised, online adaptive routing decisions without requiring offline training. Relying solely on real-time observability metrics, the approach maintains stable learning behavior even in environments with unreliable or fluctuating edge devices. Experimental results show that AIF-Router learns online in unstable edge scenarios, supporting the feasibility of active inference for multi-objective, adaptive service orchestration while also exposing practical deployment challenges.
📝 Abstract
Edge computing enables AI inference closer to data sources, reducing latency and bandwidth costs. However, orchestrating AI services across the cloud-edge continuum remains challenging due to dynamic workloads and infrastructure variability. We present AIF-Router, an Active Inference-based routing framework that autonomously learns to balance latency, throughput, and resource utilization across multi-tier AI services without offline training. AIF-Router performs Bayesian state inference and expected free energy minimization to guide routing decisions based on observability-driven real-time metrics. Despite device instability on edge nodes, AIF-Router exhibits stable online learning behavior and demonstrates the feasibility of applying Active Inference for adaptive AI service orchestration in unreliable edge environments. Our findings highlight both the promise and practical challenges of deploying self-adaptive decision-making frameworks for real-world edge AI systems.
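To make the two mechanisms named in the abstract concrete, the following is a minimal sketch of Bayesian state inference followed by expected free energy minimization for a routing choice. It is not the authors' implementation. The two-state model ("healthy" vs. "degraded"), the two observations ("fast" vs. "slow"), the probabilities in `LIKELIHOOD` and `PREFERRED`, the route names, and all function names are hypothetical, chosen only for illustration. Expected free energy is taken here in its common one-step form, risk plus ambiguity, with no state transitions modeled.

```python
import math

# Hypothetical model, one copy per route. All numbers are illustrative assumptions.
# Hidden states: 0 = healthy, 1 = degraded. Observations: 0 = fast, 1 = slow.
LIKELIHOOD = [  # LIKELIHOOD[o][s] = P(observation o | state s)
    [0.9, 0.2],  # o = fast
    [0.1, 0.8],  # o = slow
]
PREFERRED = [0.95, 0.05]  # preferred distribution over observations (favors fast)


def update_belief(prior, obs):
    """Bayes rule: posterior(s) is proportional to P(obs | s) * prior(s)."""
    unnorm = [LIKELIHOOD[obs][s] * prior[s] for s in range(len(prior))]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def expected_free_energy(belief):
    """One-step expected free energy: risk plus ambiguity."""
    n_obs, n_states = len(LIKELIHOOD), len(belief)
    # Predicted observation distribution under the current belief.
    predicted = [
        sum(LIKELIHOOD[o][s] * belief[s] for s in range(n_states))
        for o in range(n_obs)
    ]
    # Risk: KL divergence from predicted to preferred observations.
    risk = sum(
        p * math.log(p / PREFERRED[o]) for o, p in enumerate(predicted) if p > 0
    )
    # Ambiguity: expected entropy of the likelihood under the belief.
    ambiguity = -sum(
        belief[s] * LIKELIHOOD[o][s] * math.log(LIKELIHOOD[o][s])
        for s in range(n_states)
        for o in range(n_obs)
    )
    return risk + ambiguity


def choose_route(beliefs):
    """Pick the route whose belief gives the lowest expected free energy."""
    return min(beliefs, key=lambda name: expected_free_energy(beliefs[name]))


# Usage: both routes start uncertain, then the edge route returns a slow response.
beliefs = {"edge": [0.5, 0.5], "cloud": [0.5, 0.5]}
beliefs["edge"] = update_belief(beliefs["edge"], obs=1)
print(choose_route(beliefs))
```

In this toy setting, a single slow observation shifts the edge route's belief toward "degraded", which raises its expected free energy, so the next request is routed to the cloud. A fuller model would add state transitions, multiple metrics (latency, throughput, utilization), and online learning of the likelihood itself.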