Partially Observable Restless Bandits for Age-Optimal Scheduling over Markov Channels

📅 2026-05-20

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses minimizing the average Age of Information (AoI) of status updates from multiple devices in bandwidth-constrained Internet-of-Things systems where the central controller cannot observe instantaneous channel states before scheduling. The scheduling problem is formulated as a partially observable restless multi-armed bandit. Lagrangian relaxation decouples the relaxed problem into per-device subproblems, and the threshold structure of their optimal policies is used to establish indexability under Markovian channel dynamics. The authors give an algorithm to compute Whittle's index and then derive a closed-form Whittle-like index that reduces implementation complexity. In simulations, the index-based policies outperform the baselines and stay close to the optimal policy or the relaxed lower bound, especially when scheduling resources are limited or the network is large.
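As a rough illustration of the relaxation step described above, the decoupling can be sketched as follows. The notation is an assumption made here for illustration and is not taken from the paper: $N$ devices, a budget of $M$ transmissions per slot, age $\Delta_i(t)$, scheduling decision $u_i(t) \in \{0,1\}$, and multiplier $\lambda \ge 0$. The exact form of the bandwidth constraint (inequality versus equality) is also assumed.

```latex
% Original problem (hard per-slot constraint), assumed notation:
\min_{\pi} \; \limsup_{T\to\infty} \frac{1}{T}\,
  \mathbb{E}_{\pi}\!\left[\sum_{t=1}^{T}\sum_{i=1}^{N} \Delta_i(t)\right]
\quad \text{s.t.} \quad \sum_{i=1}^{N} u_i(t) \le M \;\; \forall t .

% Relaxed problem: the constraint only has to hold on time average.
\limsup_{T\to\infty} \frac{1}{T}\,
  \mathbb{E}_{\pi}\!\left[\sum_{t=1}^{T}\sum_{i=1}^{N} u_i(t)\right] \le M .

% Lagrangian of the relaxed problem: it separates across devices,
% so each device solves its own problem with a per-transmission cost \lambda.
\min_{\pi} \; \limsup_{T\to\infty} \frac{1}{T}\,
  \mathbb{E}_{\pi}\!\left[\sum_{t=1}^{T}\sum_{i=1}^{N}
    \bigl(\Delta_i(t) + \lambda\, u_i(t)\bigr)\right] - \lambda M .
```

In this standard construction, the Whittle index of a device state is the value of $\lambda$ at which scheduling and idling are equally attractive in the single-device subproblem, which is why indexability has to be established first.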

📝 Abstract

Demand for fresh information is surging with the proliferation of Internet of Things (IoT) applications. To characterize the information freshness perceived by the destination, the age of information (AoI) has been proposed. In this paper, we consider an IoT system with multiple devices sending status update packets to a central controller through time-correlated Markov channels and assume that the instantaneous channel states are not available to the central controller before making scheduling decisions. To ensure information freshness, we investigate a timely scheduling problem that minimizes the total expected time-average AoI under a strict communications bandwidth constraint. We formulate this problem as a partially observable restless multi-armed bandit problem. Using Lagrangian relaxation, we decouple the relaxed problem into multiple sub-problems and prove the threshold structure of their optimal policies. Armed with this property, we establish the indexability for the decoupled problem and design an algorithm to compute Whittle's index. To reduce implementation complexity, we further derive the Whittle-like index in closed form for low-complexity scheduling. Simulation results show that the proposed index-based policies outperform the baselines, remain close to the optimal policy or relaxed lower bound, and are especially effective when scheduling resources are limited or the network size is large.
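The setting in the abstract can be made concrete with a small simulation sketch. Everything specific here is an assumption for illustration, not the paper's model: a two-state (good/bad) Markov channel with transition probabilities `p_gg` and `p_bg`, an age that resets to 1 on a successful delivery, feedback that reveals the channel state only for scheduled devices, and a placeholder priority `age * belief`. That priority is a simple heuristic and is not the Whittle or Whittle-like index derived in the paper.

```python
import random


def belief_update(belief, p_gg, p_bg):
    """Predict P(channel good) one slot ahead from the current belief."""
    return belief * p_gg + (1.0 - belief) * p_bg


def step_device(age, belief, scheduled, channel_good, p_gg, p_bg):
    """Advance one device by one slot and return (new_age, new_belief)."""
    if scheduled:
        # Assumption: an attempt reveals the channel state through feedback.
        new_age = 1 if channel_good else age + 1
        observed = 1.0 if channel_good else 0.0
        new_belief = belief_update(observed, p_gg, p_bg)
    else:
        # No observation: the age grows and the belief drifts via the chain.
        new_age = age + 1
        new_belief = belief_update(belief, p_gg, p_bg)
    return new_age, new_belief


def simulate(num_devices=4, budget=1, horizon=10000,
             p_gg=0.9, p_bg=0.2, seed=0):
    """Return the time-average AoI per device under the placeholder priority."""
    rng = random.Random(seed)
    ages = [1] * num_devices
    beliefs = [0.5] * num_devices
    channels = [rng.random() < 0.5 for _ in range(num_devices)]
    total_age = 0
    for _ in range(horizon):
        # Hypothetical priority (NOT the paper's index): age times belief.
        order = sorted(range(num_devices),
                       key=lambda i: ages[i] * beliefs[i], reverse=True)
        chosen = set(order[:budget])
        for i in range(num_devices):
            ages[i], beliefs[i] = step_device(
                ages[i], beliefs[i], i in chosen, channels[i], p_gg, p_bg)
            # The hidden channel then evolves as a two-state Markov chain.
            p_good = p_gg if channels[i] else p_bg
            channels[i] = rng.random() < p_good
        total_age += sum(ages)
    return total_age / (horizon * num_devices)


if __name__ == "__main__":
    print(simulate())
```

Replacing the `key` function in `simulate` with a different priority rule is enough to compare scheduling policies under the same channel realization, since the random seed fixes the channel sample path only when the policies make the same number of random draws, which is the case here.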

Problem

Research questions and friction points this paper is trying to address.

Age of Information

Partially Observable

Restless Bandits

Markov Channels

Scheduling

Innovation

Methods, ideas, or system contributions that make the work stand out.

Partially Observable Restless Bandits

Age of Information (AoI)

Whittle Index