Contrastive Federated Learning with Tabular Data Silos

๐Ÿ“… 2024-09-10
๐Ÿ›๏ธ arXiv.org
๐Ÿ“ˆ Citations: 1
โœจ Influential: 1
๐Ÿ“„ PDF
๐Ÿค– AI Summary
In vertical federated learning (VFL), privacy constraints prohibit sharing raw or representative data across isolated data silos, while samples remain inherently misalignedโ€”posing a fundamental challenge for collaborative modeling. Method: We propose the first contrastive federated learning framework specifically designed for misaligned tabular data silos. Our approach integrates local contrastive representation learning with a tabular-structure-aware contrastive loss, coupled with privacy-preserving federated aggregation and a shared-data-free knowledge distillation mechanism to enable implicit cross-silo alignment and knowledge collaboration. Contribution/Results: This work pioneers the application of contrastive learning to VFL, eliminating reliance on sample alignment or inter-silo data sharing. Evaluated on multiple real-world tabular benchmarks, our framework achieves an average accuracy improvement of 3.2% over state-of-the-art baselines while strictly satisfying end-to-end privacy guarantees under standard threat models.

Technology Category

Application Category

๐Ÿ“ Abstract
Learning from vertical partitioned data silos is challenging due to the segmented nature of data, sample misalignment, and strict privacy concerns. Federated learning has been proposed as a solution. However, sample misalignment across silos often hinders optimal model performance and suggests data sharing within the model, which breaks privacy. Our proposed solution is Contrastive Federated Learning with Tabular Data Silos (CFL), which offers a solution for data silos with sample misalignment without the need for sharing original or representative data to maintain privacy. CFL begins with local acquisition of contrastive representations of the data within each silo and aggregates knowledge from other silos through the federated learning algorithm. Our experiments demonstrate that CFL solves the limitations of existing algorithms for data silos and outperforms existing tabular contrastive learning. CFL provides performance improvements without loosening privacy.
Problem

Research questions and friction points this paper is trying to address.

Addresses sample misalignment in data silos
Enhances privacy in federated learning
Improves tabular data learning efficiency
Innovation

Methods, ideas, or system contributions that make the work stand out.

Contrastive Federated Learning
Privacy-preserving data sharing
Local contrastive representation acquisition
๐Ÿ”Ž Similar Papers
No similar papers found.
A
Achmad Ginanjar
School of Electrical Engineering and Computer Science, The University of Queensland, Australia
X
Xue Li
School of Electrical Engineering and Computer Science, The University of Queensland, Australia
Wen Hua
Wen Hua
The Hong Kong Polytechnic University
DatabaseInformation SystemData MiningDeep Learning