Vertical Federated Learning in Practice: The Good, the Bad, and the Ugly

📅 2025-02-12
🤖 AI Summary
This work exposes a critical gap between vertical federated learning (VFL) research and industrial deployment. Through empirical analysis of real-world cross-institutional data distributions, we identify four fundamental misalignments: feature non-alignment, label sparsity, sample heterogeneity, and communication constraints. To address these misalignments, we propose the first data-driven taxonomy of VFL algorithms, which reveals several practical scenarios, such as sparse-label and highly heterogeneous settings, where existing methods fail. Integrating distribution-aware modeling, a systematic literature review, and multi-dimensional evaluation metrics, we introduce a comprehensive, deployment-oriented VFL evaluation framework. Our analysis culminates in five prioritized technical directions for advancing trustworthy, scalable, and production-ready VFL systems, providing both theoretical foundations and a concrete roadmap for bridging research and practice.

📝 Abstract
Vertical Federated Learning (VFL) is a privacy-preserving collaborative learning paradigm that enables multiple parties with distinct feature sets to jointly train machine learning models without sharing their raw data. Despite its potential to facilitate cross-organizational collaborations, the deployment of VFL systems in real-world applications remains limited. To investigate the gap between existing VFL research and practical deployment, this survey analyzes the real-world data distributions in potential VFL applications and identifies four key findings that highlight this gap. We propose a novel data-oriented taxonomy of VFL algorithms based on real VFL data distributions. Our comprehensive review of existing VFL algorithms reveals that some common practical VFL scenarios have few or no viable solutions. Based on these observations, we outline key research directions aimed at bridging the gap between current VFL research and real-world applications.
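
To make the setting in the abstract concrete, here is a minimal Python sketch of a vertically partitioned linear score. It is an illustration only, not the survey's method: the party names, toy data, weights, and the helper functions `aligned_ids`, `partial_scores`, and `joint_scores` are all hypothetical, and the plain set intersection stands in for the private ID alignment a real deployment would need.

```python
# Hypothetical toy data: two parties hold DIFFERENT features for overlapping sample IDs.
party_a = {"u1": [1.0, 2.0], "u2": [0.5, 0.0], "u3": [3.0, 1.0]}  # two features per sample
party_b = {"u2": [4.0], "u3": [2.0], "u4": [1.0]}                 # one feature per sample


def aligned_ids(a, b):
    """Sample IDs present at both parties (assumption: plain intersection, not private)."""
    return sorted(set(a) & set(b))


def partial_scores(features, weights, ids):
    """Run locally by one party: dot product of its own features with its own weights."""
    return {i: sum(w * x for w, x in zip(weights, features[i])) for i in ids}


def joint_scores(partials):
    """Sum the per-party partial scores; raw feature values are never passed in here."""
    ids = partials[0].keys()
    return {i: sum(p[i] for p in partials) for i in ids}


ids = aligned_ids(party_a, party_b)
w_a, w_b = [0.5, -1.0], [0.25]  # hypothetical per-party model weights
scores = joint_scores([
    partial_scores(party_a, w_a, ids),
    partial_scores(party_b, w_b, ids),
])
print(ids)     # only samples known to both parties can be scored jointly
print(scores)
```

Only the samples shared by both parties (`u2` and `u3` here) take part in the joint computation, which is one reason the real-world data distributions the survey analyzes matter for whether a VFL algorithm is usable.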

Problem

Research questions and friction points this paper is trying to address.

Analyzes real-world VFL data distributions
Identifies gaps in VFL research and deployment
Proposes taxonomy for VFL algorithms

Innovation

Methods, ideas, or system contributions that make the work stand out.

Vertical Federated Learning
Privacy-preserving collaboration
Data-oriented taxonomy