🤖 AI Summary
This work addresses low-rank shared feature recovery in decentralized multi-task representation learning under communication constraints. We consider a setting where task-specific data are distributed across nodes, and inter-node communication is restricted to a sparse network topology. To this end, we propose the first alternating projected gradient algorithm with provable accuracy guarantees for this decentralized setting. Our method jointly incorporates low-rank structural constraints and decentralized optimization, achieving communication complexity independent of the target accuracy—marking a significant improvement over existing distributed algorithms whose communication costs scale with accuracy. Theoretical analysis provides the first unified characterization of time, communication, and sample complexities. Empirical evaluations demonstrate reduced communication overhead across diverse network topologies and, in certain scenarios, superior performance compared to centralized federated learning baselines.
📝 Abstract
Representation learning is a widely adopted framework for learning in data-scarce environments, aiming to extract common features from related tasks. While centralized approaches have been extensively studied, decentralized methods remain largely underexplored. We study decentralized multi-task representation learning in which the features share a low-rank structure. We consider multiple tasks, each with a finite number of data samples, where the observations follow a linear model with task-specific parameters. In the decentralized setting, task data are distributed across multiple nodes, and information exchange between nodes is constrained by a communication network. The goal is to recover the underlying feature matrix whose rank is much smaller than both the parameter dimension and the number of tasks. We propose a new alternating projected gradient and minimization algorithm with provable accuracy guarantees. We provide comprehensive characterizations of the time, communication, and sample complexities. Importantly, the communication complexity is independent of the target accuracy, which significantly reduces communication cost compared to prior methods. Numerical simulations validate the theoretical analysis across different dimensions and network topologies, and demonstrate regimes in which decentralized learning outperforms centralized federated approaches.