🤖 AI Summary
This paper addresses distributed sampling and remote estimation of autoregressive Markov processes in multi-hop wireless networks, aiming to minimize time-average estimation error and/or Age of Information (AoI). We tackle three key challenges: statistically identical agents, collision-prone shared channels, and the need for each agent to cache the most recent samples from the others. We prove that under oblivious policies, where decisions do not depend on the physical processes, minimizing estimation error is equivalent to minimizing AoI. Because optimal transmission policies are analytically intractable, we propose a topology-transferable multi-agent reinforcement learning framework based on graph neural networks (GNNs), combining a permutation-equivariant architecture, recurrent state modeling, and training by either independent learning or centralized training with decentralized execution (CTDE). Experiments show that our approach outperforms state-of-the-art baselines. Policies trained on small or moderate-size networks transfer to larger ones, with performance gains increasing as the number of agents grows. Moreover, recurrence is pivotal in both training regimes and improves resilience to the non-stationarity of independent learning.
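The link between estimation error and age can be illustrated on a toy case. The sketch below is not the paper's model: it assumes a scalar AR(1) process `X[t+1] = a*X[t] + W[t]` with zero-mean Gaussian noise of variance `sigma2`, and an estimator that predicts `a**age * X[t-age]` from a cached sample. The names `error_variance` and `simulated_mse` and the values of `A` and `SIGMA2` are hypothetical. Under these assumptions the mean-squared error depends only on the age of the cached sample and grows with it, which is the intuition behind treating error minimization as age minimization when decisions ignore the process values.

```python
import random


def error_variance(a: float, sigma2: float, age: int) -> float:
    """Closed-form MSE of the estimate a**age * X[t-age] for X[t+1] = a*X[t] + W[t]."""
    # The error is the accumulated noise: sum over k of a**k * W[t-1-k], k < age.
    return sigma2 * sum(a ** (2 * k) for k in range(age))


def simulated_mse(a: float, sigma2: float, age: int,
                  trials: int = 20000, seed: int = 0) -> float:
    """Monte Carlo estimate of the same MSE (illustrative check only)."""
    rng = random.Random(seed)
    sd = sigma2 ** 0.5
    total = 0.0
    for _trial in range(trials):
        x_cached = rng.gauss(0.0, 1.0)   # value of the cached sample (arbitrary)
        x = x_cached
        for _step in range(age):         # let the process evolve for `age` slots
            x = a * x + rng.gauss(0.0, sd)
        total += (x - a ** age * x_cached) ** 2
    return total / trials


# Assumed parameters for the illustration.
A, SIGMA2 = 0.9, 1.0
closed_form = [error_variance(A, SIGMA2, age) for age in range(6)]
```

With these assumed parameters, `closed_form` starts at zero for a fresh sample and increases with every additional slot of age, and `simulated_mse(A, SIGMA2, 3)` should land close to `closed_form[3]`.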
📝 Abstract
We address the challenge of sampling and remote estimation for autoregressive Markovian processes in a multi-hop wireless network with statistically identical agents. Agents cache the most recent samples from others and communicate over wireless collision channels governed by an underlying graph topology. Our goal is to minimize time-average estimation error and/or age of information with decentralized, scalable sampling and transmission policies. We consider both oblivious policies, where decision-making is independent of the physical processes, and non-oblivious policies, where decision-making depends on the physical processes. We prove that under oblivious policies, minimizing estimation error is equivalent to minimizing the age of information. The complexity of the problem, especially the multi-dimensional action spaces and arbitrary network topologies, makes theoretical methods for finding optimal transmission policies intractable. We therefore optimize the policies using a graphical multi-agent reinforcement learning framework, where each agent employs a permutation-equivariant graph neural network architecture. We prove that the proposed framework exhibits desirable transferability properties, allowing transmission policies trained on small or moderate-size networks to be executed effectively on large-scale topologies. Numerical experiments demonstrate that (i) our proposed framework outperforms state-of-the-art baselines; (ii) the trained policies are transferable to larger networks, and their performance gains increase with the number of agents; (iii) the training procedure withstands non-stationarity even if we utilize independent learning techniques; and (iv) recurrence is pivotal in both independent learning and centralized training and decentralized execution, and improves the resilience to non-stationarity in independent learning.
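Permutation equivariance and size-independence can be made concrete with a small sketch. It is not the authors' architecture: it assumes a single polynomial graph-filter layer, `y = tanh(sum_k taps[k] * S^k x)`, where `S` is an adjacency matrix and `x` holds one scalar feature per node. The helper names (`matvec`, `graph_filter_layer`, `permute`) and all numeric values are hypothetical. Because the coefficients `taps` are shared by every node and do not depend on the number of nodes, relabeling the nodes only relabels the outputs, and the same coefficients can be applied to a larger graph.

```python
import math


def matvec(S, x):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(s_ij * x_j for s_ij, x_j in zip(row, x)) for row in S]


def graph_filter_layer(S, x, taps):
    """y = tanh(sum_k taps[k] * S^k x); the taps are shared by every node."""
    out = [0.0] * len(x)
    z = list(x)                      # z holds S^k x, starting with k = 0
    for h in taps:
        out = [o + h * z_i for o, z_i in zip(out, z)]
        z = matvec(S, z)             # advance to S^(k+1) x
    return [math.tanh(o) for o in out]


def permute(S, x, perm):
    """Relabel nodes so that new node i is old node perm[i]."""
    S_p = [[S[pi][pj] for pj in perm] for pi in perm]
    x_p = [x[pi] for pi in perm]
    return S_p, x_p


# Assumed example: a 4-node path graph with one scalar feature per node.
S_small = [[0, 1, 0, 0],
           [1, 0, 1, 0],
           [0, 1, 0, 1],
           [0, 0, 1, 0]]
x_small = [1.0, -0.5, 0.25, 2.0]
taps = [0.5, 0.3, -0.2]              # hypothetical "trained" coefficients
perm = [2, 0, 3, 1]

y = graph_filter_layer(S_small, x_small, taps)
S_p, x_p = permute(S_small, x_small, perm)
y_p = graph_filter_layer(S_p, x_p, taps)

# Equivariance: filtering the relabeled graph equals relabeling the output.
equivariant = all(abs(y_p[i] - y[perm[i]]) < 1e-12 for i in range(len(perm)))

# The same taps run unchanged on a larger graph (here a 6-node ring).
n = 6
S_large = [[1 if (i - j) % n in (1, n - 1) else 0 for j in range(n)]
           for i in range(n)]
y_large = graph_filter_layer(S_large, [float(i) for i in range(n)], taps)
```

The sketch only shows that the parameters are reusable across graph sizes; it says nothing about how well a policy performs after transfer, which is what the paper's transferability result and experiments address.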