🤖 AI Summary
This work addresses the optimization of information freshness in decentralized edge networks, where multiple clients operate under partial observability: they cannot observe access node (AN) states or the actions of other clients. The problem is formulated as a non-stationary multi-armed bandit in which the reward is the age-of-information reduction obtained when a client selects an AN. The authors propose the AGING BANDIT WITH ADAPTIVE RESET algorithm, which combines adaptive windowing with periodic monitoring to track both abrupt and gradual changes in expected rewards. This targets the setting where classical bandit approaches are ineffective, namely a reward process that is history-dependent and coupled across clients. The authors establish theoretical guarantees showing near-optimal performance and validate them through simulations.
📝 Abstract
We study a decentralized collaborative requesting problem that aims to optimize the information freshness of time-sensitive clients in edge networks consisting of multiple clients, access nodes (ANs), and servers. Clients request content through ANs acting as gateways, without observing AN states or the actions of other clients. We define the reward as the age of information reduction resulting from a client's selection of an AN, and formulate the problem as a non-stationary multi-armed bandit. In this decentralized and partially observable setting, the resulting reward process is history-dependent and coupled across clients, and exhibits both abrupt and gradual changes in expected rewards, rendering classical bandit-based approaches ineffective. To address these challenges, we propose the AGING BANDIT WITH ADAPTIVE RESET algorithm, which combines adaptive windowing with periodic monitoring to track evolving reward distributions. We establish theoretical performance guarantees showing that the proposed algorithm achieves near-optimal performance, and we validate the theoretical results through simulations.
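To make the non-stationary bandit formulation concrete, the following is a minimal sketch of a generic sliding-window UCB learner with an optional periodic reset. It is not the AGING BANDIT WITH ADAPTIVE RESET algorithm, whose details are not given here. The class name `SlidingWindowUCB`, the fixed window length, the reset period, the exploration constant `c`, and the Bernoulli rewards standing in for a normalized age-of-information reduction are all assumptions made for illustration.

```python
import math
import random
from collections import deque


class SlidingWindowUCB:
    """Generic sliding-window UCB (illustrative; not the paper's algorithm)."""

    def __init__(self, n_arms, window, reset_period=None, c=1.0):
        self.n_arms = n_arms              # number of ANs a client can choose from
        self.reset_period = reset_period  # hypothetical fixed reset interval
        self.c = c                        # exploration constant (assumed)
        self.history = deque(maxlen=window)  # most recent (arm, reward) pairs
        self.t = 0

    def select(self):
        counts = [0] * self.n_arms
        sums = [0.0] * self.n_arms
        for arm, reward in self.history:
            counts[arm] += 1
            sums[arm] += reward
        # Play any arm that has no sample inside the current window.
        for arm in range(self.n_arms):
            if counts[arm] == 0:
                return arm
        n = len(self.history)

        def index(arm):
            mean = sums[arm] / counts[arm]
            bonus = self.c * math.sqrt(math.log(n) / counts[arm])
            return mean + bonus

        return max(range(self.n_arms), key=index)

    def update(self, arm, reward):
        self.history.append((arm, reward))
        self.t += 1
        # Periodic reset: discard all samples every `reset_period` rounds.
        if self.reset_period and self.t % self.reset_period == 0:
            self.history.clear()


def simulate(steps=2000, change_at=1000, tail=500, seed=0):
    """Two arms whose means swap abruptly; returns the fraction of the
    last `tail` rounds in which the post-change best arm (arm 1) was chosen."""
    rng = random.Random(seed)
    bandit = SlidingWindowUCB(n_arms=2, window=100, reset_period=400)
    tail_hits = 0
    for t in range(steps):
        means = (0.9, 0.1) if t < change_at else (0.1, 0.9)
        arm = bandit.select()
        # Bernoulli reward as a stand-in for a normalized AoI reduction.
        reward = 1.0 if rng.random() < means[arm] else 0.0
        bandit.update(arm, reward)
        if t >= steps - tail and arm == 1:
            tail_hits += 1
    return tail_hits / tail


if __name__ == "__main__":
    print(f"best-arm fraction after the change: {simulate():.2f}")
```

Because only the most recent samples enter the index, estimates formed before the abrupt change are forgotten within one window length. A fixed window and a fixed reset period are the simplest possible choices; the abstract describes adaptive windowing and periodic monitoring, which this sketch does not attempt to reproduce.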