AI Summary
This work addresses decentralized multi-task linear regression under data scarcity, where tasks share an underlying low-dimensional linear representation but lack centralized coordination.
Method: We propose the first serverless federated multi-task representation learning algorithm based on alternating projected gradient descent. It operates over a diffusion-based network topology, integrating distributed optimization, low-rank matrix recovery, and iterative projection-based updates, without any central server.
Contribution/Results: Theoretically, we establish a lower bound on the required sample complexity and an upper bound on the iteration complexity of the algorithm, and show that it is fast and communication-efficient. Empirically, our algorithm achieves faster convergence and significantly lower communication overhead than state-of-the-art baselines, demonstrating both strong theoretical guarantees and practical efficacy.
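The serverless, diffusion-based coordination described above can be illustrated with a minimal gossip-averaging sketch. The ring topology, node count, and mixing weights below are illustrative assumptions for the demo, not the paper's setup:

```python
import numpy as np

# Illustrative sketch: diffusion (gossip) averaging over a ring of n nodes.
# Each node repeatedly mixes its local estimate with its neighbors via a
# doubly stochastic weight matrix W; this drives all nodes toward consensus
# without any central server. Topology and weights are demo assumptions.
n = 6
W = np.zeros((n, n))
for i in range(n):
    W[i, i] = 0.5                    # weight on own estimate
    W[i, (i - 1) % n] = 0.25         # weight on left neighbor
    W[i, (i + 1) % n] = 0.25         # weight on right neighbor

rng = np.random.default_rng(1)
x = rng.standard_normal(n)           # local scalar estimates, one per node
mean = x.mean()                      # consensus value (preserved by W)

for _ in range(100):
    x = W @ x                        # one diffusion step: average with neighbors

spread = np.max(np.abs(x - mean))    # distance of each node from consensus
```

Because `W` is doubly stochastic, the network-wide average is preserved at every step, and the spread contracts geometrically at a rate set by the second-largest eigenvalue of `W`.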
Abstract
Representation learning is a widely adopted framework for learning in data-scarce environments: a feature extractor, or representation, is obtained from different yet related tasks. Despite extensive research on representation learning, decentralized approaches remain relatively underexplored. This work develops a decentralized projected gradient descent-based algorithm for multi-task representation learning. We focus on the problem of multi-task linear regression in which multiple linear regression models share a common, low-dimensional linear representation. We present an alternating projected gradient descent and minimization algorithm for recovering a low-rank feature matrix in a diffusion-based decentralized and federated fashion. We obtain constructive, provable guarantees that provide a lower bound on the required sample complexity and an upper bound on the iteration complexity of our proposed algorithm. We analyze the time and communication complexity of our algorithm and show that it is fast and communication-efficient. We perform numerical simulations to validate the performance of our algorithm and compare it with benchmark algorithms.
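The setting in the abstract, multiple linear-regression tasks sharing a low-rank representation, recovered by alternating a local head-minimization step, a gradient step on the representation, diffusion mixing, and a projection, can be sketched as a toy simulation. All dimensions, the step size, the ring topology, and the QR-based projection onto orthonormal matrices below are illustrative assumptions, not the paper's algorithm:

```python
import numpy as np

# Toy sketch: n nodes, each with one task y_i = X_i @ (U* @ w_i*) (noiseless),
# where U* (d x r) is the shared low-rank representation. Each node alternates:
#   (1) exactly minimize over its task head w_i given its local U_i,
#   (2) take a gradient step on U_i,
#   (3) mix U_i with neighbors (diffusion), then project via QR.
rng = np.random.default_rng(0)
n, d, r, m = 8, 20, 3, 60                      # nodes, ambient dim, rank, samples/task

U_star, _ = np.linalg.qr(rng.standard_normal((d, r)))   # ground-truth representation
w_star = rng.standard_normal((n, r))                    # ground-truth task heads
X = rng.standard_normal((n, m, d))                      # local design matrices
y = np.einsum('imd,dk,ik->im', X, U_star, w_star)       # local labels

# Doubly stochastic mixing matrix for a ring topology (assumed for the demo).
W = np.zeros((n, n))
for i in range(n):
    W[i, i] = 0.5
    W[i, (i - 1) % n] = 0.25
    W[i, (i + 1) % n] = 0.25

def subspace_err(Ui):
    # Frobenius distance between span(U*) and its projection onto span(Ui).
    return np.linalg.norm(U_star - Ui @ (Ui.T @ U_star))

U = np.stack([np.linalg.qr(rng.standard_normal((d, r)))[0] for _ in range(n)])
err0 = np.mean([subspace_err(U[i]) for i in range(n)])
eta = 0.1 / m                                   # step size (demo choice)

for _ in range(500):
    # (1) Minimization: each node solves least squares for its head w_i.
    w = np.stack([np.linalg.lstsq(X[i] @ U[i], y[i], rcond=None)[0]
                  for i in range(n)])
    # (2) Local gradient of 0.5*||X_i U_i w_i - y_i||^2 with respect to U_i.
    G = np.stack([X[i].T @ np.outer(X[i] @ U[i] @ w[i] - y[i], w[i])
                  for i in range(n)])
    # (3) Adapt-then-combine diffusion step, then QR projection.
    V = U - eta * G
    U = np.einsum('ij,jdk->idk', W, V)
    U = np.stack([np.linalg.qr(U[i])[0] for i in range(n)])

err_final = np.mean([subspace_err(U[i]) for i in range(n)])
```

On this easy noiseless instance the average subspace error `err_final` falls well below its random-initialization value `err0`; the paper's guarantees concern the general regime, which this sketch does not attempt to reproduce.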