FedCDC: A Collaborative Framework for Data Consumers in Federated Learning Market

📅 2025-02-26
🤖 AI Summary
In federated learning (FL) markets, budget-constrained data consumers often cannot recruit enough data owners, which degrades model performance. To address this, the paper proposes a collaborative recruitment and training framework. The framework identifies shared subtasks across consumers via subtask clustering, constructs multi-consumer joint submodels, and employs ensemble knowledge distillation to fuse submodel knowledge into each consumer's global model, supported by a federated parameter coordination mechanism to ensure training stability. It establishes, for the first time, collaborative data utilization among multiple consumers in FL markets, introducing a three-tiered paradigm (subtask discovery → joint training → distillation-based ensemble) that overcomes the limitations of conventional one-to-one matching. Evaluations on three benchmark datasets demonstrate average accuracy improvements of 12.7%–18.3% for participating consumers, significantly mitigating performance degradation caused by restricted data access.

📝 Abstract
Federated learning (FL) allows machine learning models to be trained on distributed datasets without directly accessing local data. In FL markets, numerous Data Consumers compete to recruit Data Owners for their respective training tasks, but budget constraints and competition can prevent them from securing sufficient data. While existing solutions focus on optimizing one-to-one matching between Data Owners and Data Consumers, we propose FedCDC, a novel framework that facilitates collaborative recruitment and training for Data Consumers with similar tasks. Specifically, FedCDC detects shared subtasks among multiple Data Consumers and coordinates the joint training of submodels specialized for these subtasks. Then, through ensemble distillation, these submodels are integrated into each Data Consumer's global model. Experimental evaluations on three benchmark datasets demonstrate that restricting Data Consumers' access to Data Owners significantly degrades model performance; however, by incorporating FedCDC, this performance loss is effectively mitigated, resulting in substantial accuracy gains for all participating Data Consumers.
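The shared-subtask detection step described in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: it assumes, purely for illustration, that each Data Consumer's task is described by a set of class labels and that a label counts as a shared subtask when at least two consumers need it. The function name `find_shared_subtasks`, the `min_consumers` parameter, and the example consumers are all hypothetical.

```python
from collections import defaultdict


def find_shared_subtasks(consumer_labels, min_consumers=2):
    """Return {label: sorted list of consumers} for labels needed by several consumers.

    Assumption (not from the paper): a "subtask" is a single class label,
    and it is "shared" when at least `min_consumers` consumers request it.
    """
    consumers_by_label = defaultdict(set)
    for consumer, labels in consumer_labels.items():
        for label in labels:
            consumers_by_label[label].add(consumer)
    return {
        label: sorted(consumers)
        for label, consumers in consumers_by_label.items()
        if len(consumers) >= min_consumers
    }


# Hypothetical consumers and label sets, for illustration only.
tasks = {
    "consumer_a": {"cat", "dog", "car"},
    "consumer_b": {"dog", "car", "truck"},
    "consumer_c": {"bird", "cat"},
}
shared = find_shared_subtasks(tasks)
for label in sorted(shared):
    print(label, shared[label])
```

In this toy setup, each shared label would correspond to one jointly trained submodel, with the listed consumers pooling their recruited Data Owners for it.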
Problem

Research questions and friction points this paper is trying to address.

Facilitates collaborative recruitment in FL
Detects shared subtasks among Data Consumers
Mitigates performance loss in model training
Innovation

Methods, ideas, or system contributions that make the work stand out.

Collaborative recruitment for Data Consumers
Joint training of specialized submodels
Ensemble distillation for global model integration
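To make the ensemble distillation idea concrete, here is a minimal sketch of one common formulation: the softened predictions of several submodels (teachers) are averaged, and the global model (student) is penalized by the KL divergence from that average. The source does not specify the loss FedCDC uses, so this is an assumption. It also assumes all submodels output logits over the same label space, which may not hold for submodels specialized to different subtasks. All function names are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with a temperature parameter."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_soft_targets(teacher_logits, temperature=1.0):
    """Average the teachers' softened class probabilities (assumed fusion rule)."""
    probs = [softmax(logits, temperature) for logits in teacher_logits]
    return [sum(column) / len(probs) for column in zip(*probs)]


def distillation_loss(student_logits, teacher_logits, temperature=1.0):
    """KL(ensemble || student) for a single example."""
    target = ensemble_soft_targets(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    return sum(t * math.log(t / s) for t, s in zip(target, student) if t > 0)


# Two hypothetical submodels that disagree on a two-class example.
teachers = [[2.0, 0.0], [0.0, 2.0]]
print(ensemble_soft_targets(teachers))
print(distillation_loss([3.0, 0.0], teachers))
```

In an actual training loop this loss would be averaged over a batch and minimized with respect to the global model's parameters, typically alongside a supervised loss on the consumer's own data.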
Zhuan Shi
École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
Patrick Ohl
École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
Boi Faltings
EPFL