🤖 AI Summary
This work addresses the limitations of existing federated recommendation systems, which rely on ID-indexed communication and consequently suffer from high communication overhead, poor generalization, and sensitivity to noise. To overcome these issues, the authors propose a feature-indexed communication paradigm: discrete item codes are generated via residual quantization (RQ)-KMeans, clients learn codebook embeddings, and the server aggregates these codebooks instead of raw item embeddings, enabling controllable communication and cross-item information sharing. Furthermore, a collaborative-semantic dual-channel curriculum aggregation strategy is introduced to enhance model generalization and robustness. Experimental results on real-world datasets demonstrate that the proposed method significantly outperforms state-of-the-art federated recommendation algorithms while substantially reducing communication costs.
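The server-side step described above — collecting client codebooks and merging them instead of raw item embeddings — can be sketched as a simple FedAvg-style weighted mean. This is a minimal illustration, not the paper's exact aggregation rule; the function name `aggregate_codebooks` and the optional per-client weighting are assumptions.

```python
import numpy as np

def aggregate_codebooks(client_codebooks, client_weights=None):
    """FedAvg-style merge of client codebooks (illustrative sketch).

    client_codebooks: list of (num_codes, dim) arrays, one per client.
    client_weights: optional per-client weights (e.g. local data sizes);
        the exact weighting scheme is an assumption, not from the paper.
    Returns a single (num_codes, dim) aggregated codebook.
    """
    n = len(client_codebooks)
    if client_weights is None:
        w = np.full(n, 1.0 / n)
    else:
        w = np.asarray(client_weights, dtype=float)
        w = w / w.sum()
    stacked = np.stack(client_codebooks)      # (n_clients, num_codes, dim)
    return np.tensordot(w, stacked, axes=1)   # weighted mean -> (num_codes, dim)
```

Because every client reports a codebook of the same fixed shape `(num_codes, dim)` regardless of how many items it interacted with, the payload per round is constant — which is the "controllable communication" property the summary refers to.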
📝 Abstract
Federated recommendation provides a privacy-preserving solution for training recommender systems without centralizing user interactions. However, existing methods follow an ID-indexed communication paradigm that transmits whole item embeddings between clients and the server, which has three major limitations: 1) it consumes uncontrollable communication resources, 2) the uploaded item information cannot generalize to related non-interacted items, and 3) it is sensitive to noisy client feedback. Solving these problems requires fundamentally changing the existing ID-indexed communication paradigm. We therefore propose a feature-indexed communication paradigm that transmits feature code embeddings as codebooks rather than raw item embeddings. Building on this paradigm, we present RQFedRec, which assigns each item a list of discrete code IDs via Residual Quantization (RQ)-KMeans. Each client generates and trains code embeddings as codebooks based on the discrete code IDs provided by the server, and the server collects and aggregates these codebooks rather than item embeddings. This design makes communication controllable, since the codebooks can cover all items, and it enables updates to propagate across related items that share a code ID. In addition, because each code embedding represents many items, it is more robust to any single noisy item. To jointly capture semantic and collaborative information, RQFedRec further adopts a collaborative-semantic dual-channel aggregation with a curriculum strategy that emphasizes semantic codes early in training and gradually increases the contribution of collaborative codes. Extensive experiments on real-world datasets demonstrate that RQFedRec consistently outperforms state-of-the-art federated recommendation baselines while significantly reducing communication overhead.
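The RQ-KMeans code assignment and the curriculum schedule described in the abstract can be sketched as follows. This is an illustrative, numpy-only sketch under stated assumptions: a plain Lloyd's-iteration KMeans applied to successive residuals, and a linear ramp for the collaborative-channel weight. Function names (`rq_kmeans`, `curriculum_weight`), the number of levels, and the linear schedule are assumptions, not the paper's exact implementation.

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Plain Lloyd's-iteration KMeans; returns (centers, assignments)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        # Assign each point to its nearest center.
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        assign = d.argmin(axis=1)
        # Move each center to the mean of its assigned points.
        for j in range(k):
            pts = X[assign == j]
            if len(pts):
                centers[j] = pts.mean(axis=0)
    return centers, assign

def rq_kmeans(X, levels=3, k=8):
    """Residual Quantization with KMeans (illustrative sketch).

    Each level clusters the residual left by the previous level, so
    every item (row of X) gets one discrete code ID per level.
    Returns codes of shape (n_items, levels) and a list of codebooks.
    """
    residual = X.astype(float).copy()
    codes, codebooks = [], []
    for _ in range(levels):
        centers, assign = kmeans(residual, k)
        codebooks.append(centers)
        codes.append(assign)
        residual = residual - centers[assign]  # quantize, keep the remainder
    return np.stack(codes, axis=1), codebooks

def curriculum_weight(t, T, w_min=0.0, w_max=1.0):
    """Linearly ramp the collaborative-channel weight over T rounds.

    The linear schedule is an assumed stand-in for the paper's curriculum:
    semantic codes dominate early (weight near w_min), collaborative codes
    contribute more as training progresses (weight approaches w_max).
    """
    return w_min + (w_max - w_min) * min(t / T, 1.0)
```

Summing the selected codebook vectors across levels reconstructs an item embedding, and each added level can only shrink (never grow) the total squared reconstruction error, which is why a short list of code IDs per item suffices.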