Streaming Federated Learning with Markovian Data

📅 2025-03-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work investigates the feasibility and convergence of federated learning (FL) when clients receive streaming data generated by non-stationary Markov processes, so that samples are non-i.i.d., temporally dependent, and arrive sequentially. Most FL theory assumes that clients hold pre-collected data sets; this work instead analyzes the convergence of Minibatch SGD, Local SGD, and a variant of Local SGD with momentum over Markovian data streams. Using tools from stochastic optimization and Markov chain mixing-time analysis, the authors derive convergence bounds for smooth non-convex client objectives. The results show that the sample complexity is inversely proportional to the number of clients and that the communication complexity is comparable to the i.i.d. case, although the sample complexity remains higher than under i.i.d. sampling. These findings indicate that FL can still support collaborative learning when client data arrive as dependent streams.

📝 Abstract
Federated learning (FL) is now recognized as a key framework for communication-efficient collaborative learning. Most theoretical and empirical studies, however, rely on the assumption that clients have access to pre-collected data sets, with limited investigation into scenarios where clients continuously collect data. In many real-world applications, particularly when data is generated by physical or biological processes, client data streams are often modeled by non-stationary Markov processes. Unlike standard i.i.d. sampling, the performance of FL with Markovian data streams remains poorly understood due to the statistical dependencies between client samples over time. In this paper, we investigate whether FL can still support collaborative learning with Markovian data streams. Specifically, we analyze the performance of Minibatch SGD, Local SGD, and a variant of Local SGD with momentum. We answer affirmatively under standard assumptions and smooth non-convex client objectives: the sample complexity is proportional to the inverse of the number of clients with a communication complexity comparable to the i.i.d. scenario. However, the sample complexity for Markovian data streams remains higher than for i.i.d. sampling.
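To make the setting concrete, the following is a minimal sketch of Local SGD over Markovian data streams. It is not the authors' implementation: the symmetric two-state chain, the quadratic loss, and every name and parameter (`markov_stream`, `local_sgd`, `stay_prob`, `lr`, and so on) are assumptions chosen for illustration. Each client draws fresh, temporally dependent samples, takes a few local gradient steps, and the server averages the local models.

```python
import random

def markov_stream(seed, stay_prob=0.9, noise=0.1):
    """Yield dependent samples driven by a symmetric two-state Markov chain on {-1, +1}."""
    rng = random.Random(seed)
    state = rng.choice((-1, 1))
    while True:
        if rng.random() > stay_prob:  # switch state with probability 1 - stay_prob
            state = -state
        yield state + rng.gauss(0.0, noise)

def local_sgd(num_clients=8, rounds=200, local_steps=5, lr=0.02, w0=5.0, seed=0):
    """Local SGD on f(w) = E[(w - x)^2] / 2; the minimizer is the stationary mean (0 here)."""
    # One independent stream per client (hypothetical setup).
    streams = [markov_stream(seed + i) for i in range(num_clients)]
    w_global = w0
    for _round in range(rounds):
        local_models = []
        for stream in streams:
            w = w_global
            for _step in range(local_steps):
                x = next(stream)   # fresh streaming sample, never revisited
                w -= lr * (w - x)  # stochastic gradient step on (w - x)^2 / 2
            local_models.append(w)
        # Communication round: the server averages the local models.
        w_global = sum(local_models) / num_clients
    return w_global
```

Calling `local_sgd()` starts from `w0 = 5.0` and should end close to the stationary mean 0, up to noise caused by the dependent samples.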
Problem

Research questions and friction points this paper is trying to address.

Investigates FL feasibility with non-stationary Markovian data streams
Compares sample complexity of FL algorithms under Markovian vs i.i.d. data
Analyzes communication efficiency of Local SGD variants in streaming FL
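As a rough illustration of why temporal dependence costs samples, the sketch below compares the variance of a sample mean computed from a Markov chain with that of i.i.d. draws from the same marginal distribution. This is a toy experiment, not an analysis from the paper; the chain, the function names, and the parameters are all assumptions.

```python
import random
import statistics

def markov_mean(rng, n, stay_prob=0.9):
    """Mean of n samples from a symmetric two-state Markov chain on {-1, +1}."""
    state = rng.choice((-1, 1))  # uniform start is the stationary distribution
    total = 0
    for _ in range(n):
        if rng.random() > stay_prob:
            state = -state
        total += state
    return total / n

def iid_mean(rng, n):
    """Mean of n i.i.d. samples, uniform on {-1, +1} (same marginal as the chain)."""
    return sum(rng.choice((-1, 1)) for _ in range(n)) / n

def compare_variances(n=200, trials=2000, seed=0):
    """Return (variance of Markov sample mean, variance of i.i.d. sample mean)."""
    rng = random.Random(seed)
    var_markov = statistics.pvariance([markov_mean(rng, n) for _ in range(trials)])
    var_iid = statistics.pvariance([iid_mean(rng, n) for _ in range(trials)])
    return var_markov, var_iid
```

With a sticky chain (`stay_prob` close to 1), consecutive samples carry less new information, so the Markov estimate should be noticeably noisier than the i.i.d. one for the same number of samples.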
Innovation

Methods, ideas, or system contributions that make the work stand out.

Studies streaming FL with non-stationary Markovian data streams
Analyzes Minibatch SGD, Local SGD, and a variant of Local SGD with momentum
Shows that sample complexity is inversely proportional to the number of clients
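The benefit of adding clients can be illustrated with a small Minibatch SGD experiment. The sketch below is only a toy model under assumed settings (a symmetric two-state chain per client, a quadratic loss with minimizer 0, hypothetical names such as `minibatch_sgd` and `mean_squared_error`); it does not reproduce the paper's bounds. It starts at the minimizer and measures how much gradient noise moves the iterate for different client counts.

```python
import random

def stream_sample(rng, state, stay_prob=0.9, noise=0.1):
    """Advance a two-state Markov chain on {-1, +1}; return (new_state, sample)."""
    if rng.random() > stay_prob:
        state = -state
    return state, state + rng.gauss(0.0, noise)

def minibatch_sgd(num_clients, rounds=200, batch=4, lr=0.05, seed=0):
    """Minibatch SGD on f(w) = E[(w - x)^2] / 2, starting at the minimizer w = 0."""
    rng = random.Random(seed)
    states = [rng.choice((-1, 1)) for _ in range(num_clients)]
    w = 0.0
    for _round in range(rounds):
        grad = 0.0
        for i in range(num_clients):
            for _b in range(batch):
                # Each client reads `batch` consecutive samples from its own stream.
                states[i], x = stream_sample(rng, states[i])
                grad += w - x
        # The server averages all client gradients and takes one step.
        w -= lr * grad / (num_clients * batch)
    return w

def mean_squared_error(num_clients, trials=50):
    """Average squared distance of the final iterate from the minimizer 0."""
    return sum(minibatch_sgd(num_clients, seed=s) ** 2 for s in range(trials)) / trials
```

Comparing `mean_squared_error(1)` with `mean_squared_error(8)` should show a clearly smaller error with more clients, which matches the intuition that averaging over independent streams reduces gradient noise.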
Tan-Khiem Huynh
Inria, Université de Lyon, CITI, INSA Lyon, 69100 Villeurbanne, France
Malcolm Egan
Inria
Giovanni Neglia
Inria Sophia Antipolis Méditerranée
computer networks, smart grids, modeling, performance evaluation
Jean-Marie Gorce
Inria, Université de Lyon, CITI, INSA Lyon, 69100 Villeurbanne, France