🤖 AI Summary
Hierarchical inference (HI) in edge intelligence faces the challenge of dynamically estimating the local model’s correct-inference probability—a problem termed hierarchical inference learning (HIL)—under non-stationary data distributions and heterogeneous offloading costs.
Method: This paper proposes a confidence-driven online learning framework. It models the local model's correct-inference probability as a monotonically increasing function of its output confidence and designs two UCB-based policies: HI-LCB and its lightweight variant HI-LCB-lite.
Contribution/Results: HI-LCB achieves an order-optimal $O(\log T)$ regret bound under dynamic data and variable offloading costs—the first such guarantee for HIL. HI-LCB-lite further reduces per-sample computational complexity to $O(1)$, enabling deployment on resource-constrained edge devices. The theoretical analysis is rigorous, and extensive simulations on real-world datasets demonstrate that both algorithms significantly outperform state-of-the-art baselines, jointly optimizing latency, bandwidth consumption, and inference accuracy.
📝 Abstract
This work focuses on Hierarchical Inference (HI) in edge intelligence systems, where a compact Local-ML model on an end-device works in conjunction with a high-accuracy Remote-ML model on an edge-server. HI aims to reduce latency, improve accuracy, and lower bandwidth usage by first using the Local-ML model for inference and offloading to the Remote-ML only when the local inference is likely incorrect. A critical challenge in HI is estimating the likelihood of the local inference being incorrect, especially when data distributions and offloading costs change over time, a problem we term Hierarchical Inference Learning (HIL). We introduce a novel approach to HIL by modeling the probability of correct inference by the Local-ML as an increasing function of the model's confidence measure, a structure motivated by empirical observations but previously unexploited. We propose two policies, HI-LCB and HI-LCB-lite, based on the Upper Confidence Bound (UCB) framework. We demonstrate that both policies achieve order-optimal regret of $O(\log T)$, a significant improvement over existing HIL policies with $O(T^{2/3})$ regret guarantees. Notably, HI-LCB-lite has an $O(1)$ per-sample computational complexity, making it well-suited for deployment on devices with severe resource limitations. Simulations using real-world datasets confirm that our policies outperform existing state-of-the-art HIL methods.
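To make the offloading mechanics concrete, here is a minimal, hypothetical sketch of a confidence-based HI decision loop. It is not the paper's HI-LCB algorithm: it simply bins the local model's confidence, tracks a per-bin estimate of the correct-inference probability, applies a pessimistic lower-confidence-bound adjustment in the spirit of UCB-style exploration, and offloads when the estimated cost of a local error exceeds the offloading cost. All class and method names are invented for illustration, and the sketch ignores the monotonicity structure across confidence levels that the paper's policies exploit.

```python
import math


class ConfidenceBinnedHI:
    """Illustrative (hypothetical) hierarchical-inference offloading policy.

    Tracks P(local inference correct | confidence bin) from feedback and
    offloads when a pessimistic (LCB) estimate of that probability makes
    the expected local-error cost exceed the offloading cost.
    """

    def __init__(self, num_bins: int = 10):
        self.counts = [0] * num_bins    # samples seen per confidence bin
        self.correct = [0] * num_bins   # correct local inferences per bin
        self.t = 0                      # total samples observed so far

    def _bin(self, confidence: float) -> int:
        # Map a confidence in [0, 1] to a bin index.
        return min(int(confidence * len(self.counts)), len(self.counts) - 1)

    def lcb_correct_prob(self, confidence: float) -> float:
        """Lower confidence bound on the local correct-inference probability."""
        b = self._bin(confidence)
        if self.counts[b] == 0:
            return 0.0  # unexplored bin: be pessimistic, prefer offloading
        mean = self.correct[b] / self.counts[b]
        bonus = math.sqrt(2 * math.log(max(self.t, 2)) / self.counts[b])
        return max(0.0, mean - bonus)

    def should_offload(self, confidence: float, offload_cost: float,
                       error_cost: float = 1.0) -> bool:
        """Offload iff expected local-error cost exceeds the offloading cost."""
        p_correct = self.lcb_correct_prob(confidence)
        return (1.0 - p_correct) * error_cost > offload_cost

    def update(self, confidence: float, was_correct: bool) -> None:
        """Record feedback (available when ground truth is later observed)."""
        b = self._bin(confidence)
        self.t += 1
        self.counts[b] += 1
        self.correct[b] += int(was_correct)
```

With no feedback yet, the pessimistic estimate drives the policy to offload; as evidence accumulates that high-confidence local inferences are reliable, samples in those bins are kept local, which is the qualitative behavior the HIL setting targets.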