FedCVU: Federated Learning for Cross-View Video Understanding

📅 2026-03-23

🤖 AI Summary
This work addresses three challenges in federated cross-view video understanding: non-IID client data caused by view heterogeneity, inconsistent semantic representations across views, and high communication overhead. The authors propose FedCVU, a framework that combines view-specific normalization (VS-Norm), a lightweight cross-view contrastive alignment module (CV-Align), and a selective layer aggregation strategy (SLA) to enable efficient, robust, and privacy-preserving multi-view collaborative learning. In experiments on cross-view action recognition and person re-identification, FedCVU consistently improves accuracy on unseen views while maintaining seen-view performance, outperforms existing federated baselines, and remains robust to domain shift and communication constraints.
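
To make the idea of contrastive alignment concrete, here is a minimal sketch of an InfoNCE-style loss in plain Python. The summary does not state which loss CV-Align uses, so InfoNCE is a stand-in, and the function name `info_nce`, the `temperature` value, and the choice of what counts as a positive (for example, a feature of the same action seen from another view, or a shared class prototype) are all assumptions made for illustration.

```python
import math


def info_nce(anchors, positives, temperature=0.1):
    """InfoNCE loss: anchors[i] should match positives[i], not positives[j != i].

    Illustrative stand-in only; the loss actually used by CV-Align is not
    given in the source.
    """
    def normalize(vec):
        # L2-normalize so the dot product below is a cosine similarity.
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        return [x / norm for x in vec]

    a = [normalize(v) for v in anchors]
    p = [normalize(v) for v in positives]

    total = 0.0
    for i, anchor in enumerate(a):
        # Similarity of this anchor to every candidate, scaled by temperature.
        logits = [sum(x * y for x, y in zip(anchor, cand)) / temperature for cand in p]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Cross-entropy with the matching candidate as the target.
        total += log_denominator - logits[i]
    return total / len(a)


# Hypothetical 2-D features for two samples.
features = [[1.0, 0.0], [0.0, 1.0]]
aligned_loss = info_nce(features, features)           # positives match anchors
misaligned_loss = info_nce(features, features[::-1])  # positives are swapped
```

In this toy case the loss is close to zero when each anchor is paired with its own feature and large when the pairs are swapped, which is the pressure a contrastive regularizer applies toward consistent cross-view representations.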

📝 Abstract
Federated learning (FL) has emerged as a promising paradigm for privacy-preserving multi-camera video understanding. However, applying FL to cross-view scenarios faces three major challenges: (i) heterogeneous viewpoints and backgrounds lead to highly non-IID client distributions and overfitting to view-specific patterns, (ii) local distribution biases cause misaligned representations that hinder consistent cross-view semantics, and (iii) large video architectures incur prohibitive communication overhead. To address these issues, we propose FedCVU, a federated framework with three components: VS-Norm, which preserves normalization parameters to handle view-specific statistics; CV-Align, a lightweight contrastive regularization module to improve cross-view representation alignment; and SLA, a selective layer aggregation strategy that reduces communication without sacrificing accuracy. Extensive experiments on action understanding and person re-identification tasks under a cross-view protocol demonstrate that FedCVU consistently boosts unseen-view accuracy while maintaining strong seen-view performance, outperforming state-of-the-art FL baselines and showing robustness to domain heterogeneity and communication constraints.
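
The server-side effect of VS-Norm and SLA can be sketched as a filtered weighted average. This is one possible reading of the abstract, not the authors' implementation: it assumes normalization parameters are identified by the substring `"norm"` in their names and simply never leave the client, and that SLA amounts to averaging only parameters whose names start with a chosen prefix. The function `aggregate`, its arguments, and the parameter names are hypothetical.

```python
def aggregate(global_state, client_states, client_sizes, shared_prefixes, norm_marker="norm"):
    """Average only selected, non-normalization parameters, weighted by client data size.

    global_state:    dict mapping parameter name -> list of floats
    client_states:   list of dicts holding at least the parameters that are shared
    client_sizes:    number of local samples per client (the averaging weights)
    shared_prefixes: name prefixes of layers selected for aggregation (assumed SLA rule)
    """
    total = sum(client_sizes)
    prefixes = tuple(shared_prefixes)
    new_global = dict(global_state)

    for name, value in global_state.items():
        if norm_marker in name:
            continue  # normalization parameters stay view-specific (assumed VS-Norm rule)
        if not name.startswith(prefixes):
            continue  # layer not selected, so it is neither uploaded nor averaged
        new_global[name] = [
            sum(size * state[name][k] for state, size in zip(client_states, client_sizes)) / total
            for k in range(len(value))
        ]
    return new_global


# Hypothetical model with one shared layer, one normalization layer, one local head.
global_state = {
    "backbone.conv.weight": [0.0, 0.0],
    "backbone.norm.weight": [1.0],
    "head.fc.weight": [0.5],
}
# Each client uploads only the selected layer, which is where the bandwidth saving comes from.
client_states = [
    {"backbone.conv.weight": [1.0, 2.0]},
    {"backbone.conv.weight": [3.0, 4.0]},
]
updated = aggregate(global_state, client_states, client_sizes=[1, 3], shared_prefixes=["backbone"])
```

Here only `backbone.conv.weight` is averaged; the normalization and head entries keep their previous values on the server, and each client would keep its own copies locally.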
Problem

Research questions and friction points this paper is trying to address.

federated learning
cross-view video understanding
non-IID data
representation alignment
communication overhead
Innovation

Methods, ideas, or system contributions that make the work stand out.

Federated Learning
Cross-View Video Understanding
Non-IID Data
Contrastive Alignment
Communication Efficiency