🤖 AI Summary
This paper addresses three key challenges in distributed online personalized mean estimation: inter-agent data heterogeneity, scarcity of local samples, and the prohibitively high O(|A|²) time and space complexity of existing cooperative algorithms. To overcome these limitations, the paper proposes the first scalable decentralized framework in which agents self-organize into a sparse communication graph, eliminating reliance on all-to-all connectivity. Two low-complexity distributed estimation algorithms are designed—one based on belief propagation and the other on consensus—reducing per-iteration communication and computational complexity to O(r|A|log|A|) and O(r|A|), respectively, where r is the number of peers each agent communicates with. The paper establishes theoretical guarantees showing asymptotically optimal estimation under heterogeneous distributions and derives statistical error bounds. Experiments demonstrate significant improvements in efficiency, estimation accuracy, and scalability over state-of-the-art methods.
📝 Abstract
In numerous settings, agents lack sufficient data to directly learn a model. Collaborating with other agents may help, but it introduces a bias-variance trade-off when local data distributions differ. A key challenge is for each agent to identify peers with similar distributions while learning the model, a problem that remains largely unresolved. This study focuses on a simplified version of the overarching problem, where each agent collects samples from a real-valued distribution over time to estimate its mean. Existing algorithms face impractical space and time complexities (quadratic in the number of agents |A|). To address scalability challenges, we propose a framework where agents self-organize into a graph, allowing each agent to communicate with only a selected number of peers r. We introduce two collaborative mean estimation algorithms: one draws inspiration from belief propagation, while the other employs a consensus-based approach, with complexities of O(r|A| log |A|) and O(r|A|), respectively. We establish conditions under which both algorithms yield asymptotically optimal estimates and offer a theoretical characterization of their performance.
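To make the consensus-based idea concrete, the sketch below shows plain consensus averaging over a sparse communication graph — the basic building block the abstract alludes to, not the paper's actual algorithm (which additionally handles heterogeneous distributions by restricting collaboration to similar agents). The ring topology, step size, and agent count are illustrative assumptions; each agent holds a local sample mean and repeatedly moves toward the average of its r neighbors' estimates, so per-round cost stays at O(r|A|).

```python
import random

def build_ring_graph(num_agents, r):
    """Sparse graph: each agent links to its r nearest neighbors on a ring
    (assumed topology for illustration; the paper's self-organized graph
    would be built from estimated distribution similarity)."""
    neighbors = {i: set() for i in range(num_agents)}
    for i in range(num_agents):
        for k in range(1, r // 2 + 1):
            neighbors[i].add((i + k) % num_agents)
            neighbors[i].add((i - k) % num_agents)
    return neighbors

def consensus_round(estimates, neighbors, step=0.4):
    """One synchronous consensus update: each agent nudges its estimate
    toward the average of its neighbors' current estimates."""
    new = []
    for i, x in enumerate(estimates):
        avg = sum(estimates[j] for j in neighbors[i]) / len(neighbors[i])
        new.append(x + step * (avg - x))
    return new

random.seed(0)
n = 20                                   # number of agents |A|
local_means = [random.gauss(5.0, 1.0) for _ in range(n)]
graph = build_ring_graph(n, r=4)         # each agent talks to r=4 peers

est = list(local_means)
for _ in range(300):
    est = consensus_round(est, graph)
```

After enough rounds, all estimates contract toward the network-wide average of the local means; the convergence rate is governed by the spectral gap of the chosen graph.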