🤖 AI Summary
Addressing the dual challenges of label scarcity and multimodal data heterogeneity in federated learning, this paper proposes the first semi-supervised federated learning framework tailored to multimodal time-series data. Methodologically, it integrates modality-agnostic temporal contrastive learning with cross-modal representation alignment and introduces a similarity-guided dynamic aggregation strategy, based on representation consistency, to mitigate client-level semantic drift. Technically, the framework unifies self-supervised pretraining, federated averaging optimization, and modality-adaptive weight aggregation, enabling joint modeling of video, audio, and wearable-sensor data. Extensive experiments on benchmarks such as UCF101 show significant improvements over state-of-the-art methods: with only 10% labeled data, the framework achieves 68.48% top-1 accuracy, outperforming the FedOpt baseline by 33.13 percentage points.
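The similarity-guided dynamic aggregation described above can be illustrated with a minimal sketch. It assumes each client reports a mean embedding computed on a shared probe batch, and weights clients by the cosine similarity of that embedding to the cross-client mean; the function name, the softmax weighting, and the probe-batch setup are assumptions of this sketch, not necessarily the paper's exact rule:

```python
import numpy as np

def similarity_weighted_aggregate(client_weights, client_reprs):
    """Aggregate client model parameters, weighting each client by the
    cosine similarity of its representation to the cross-client mean.

    client_weights: list of dicts mapping parameter name -> np.ndarray
    client_reprs:   list of np.ndarray, each a mean embedding computed on
                    a shared probe batch (an assumption of this sketch)
    """
    # Normalize each client's embedding, then build a consensus anchor.
    reprs = np.stack([r / np.linalg.norm(r) for r in client_reprs])
    anchor = reprs.mean(axis=0)
    anchor /= np.linalg.norm(anchor)
    sims = reprs @ anchor                        # cosine similarity per client
    alphas = np.exp(sims) / np.exp(sims).sum()   # softmax -> aggregation weights
    # Weighted average of parameters, FedAvg-style but similarity-weighted.
    agg = {k: sum(a * w[k] for a, w in zip(alphas, client_weights))
           for k in client_weights[0]}
    return agg, alphas
```

Clients whose representations drift away from the consensus receive smaller softmax weights, damping their influence on the global model.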
📝 Abstract
Real-world federated learning faces two key challenges: limited access to labelled data and the presence of heterogeneous multi-modal inputs. This paper proposes TACTFL, a unified framework for semi-supervised multi-modal federated learning. TACTFL introduces a modality-agnostic temporal contrastive training scheme that learns representations from unlabelled client data by leveraging temporal alignment across modalities. However, as clients perform self-supervised training on heterogeneous data, local models may diverge semantically. To mitigate this, TACTFL incorporates a similarity-guided model aggregation strategy that dynamically weights client models according to their representational consistency, promoting global alignment. Extensive experiments across diverse benchmarks and modalities, including video, audio, and wearable sensors, demonstrate that TACTFL achieves state-of-the-art performance. For instance, on the UCF101 dataset with only 10% labelled data, TACTFL attains 68.48% top-1 accuracy, significantly outperforming the FedOpt baseline of 35.35%. Code will be released upon publication.
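The temporal contrastive scheme treats temporally aligned segments from two modalities as positive pairs and all other pairings in the batch as negatives. A generic InfoNCE-style sketch of such a loss is shown below; the paper's exact objective, temperature, and encoder details may differ:

```python
import numpy as np

def temporal_contrastive_loss(z_a, z_b, temperature=0.1):
    """InfoNCE-style loss over two modality embeddings.

    Rows i of z_a and z_b are embeddings of temporally aligned segments
    (positives); every other row pairing serves as a negative.
    """
    # L2-normalize so dot products are cosine similarities.
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    logits = (z_a @ z_b.T) / temperature           # pairwise similarities
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))            # positives on the diagonal
```

Because the loss depends only on embedding vectors, it is modality-agnostic: any pair of encoders (video, audio, or wearable sensors) that emit same-dimensional embeddings for aligned time windows can be trained with it.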