🤖 AI Summary
This paper identifies a novel remote Rowhammer attack in federated learning (FL), wherein a malicious client—without server privileges, physical access, or implanted backdoors—induces targeted DRAM bit flips by manipulating sensor inputs (e.g., adversarial speech signals), thereby compromising memory integrity through high-frequency, localized gradient updates on specific memory rows.
Method: We introduce a reinforcement learning (RL) agent into the FL attack framework, the first of its kind, to dynamically optimize adversarial observations for maximal gradient locality, and we combine sparse gradient updates with DRAM row-activation modeling to raise the activation rate of targeted rows. Evaluated on an automatic speech recognition (ASR) system, our approach achieves a repeated update rate (RUR) of around 70% on the targeted server model.
Contribution: This is the first practical demonstration of a remote, server-permissionless Rowhammer attack against FL, challenging the foundational security assumption that “clients are untrusted but servers are trusted.” It empirically exposes a previously overlooked hardware-level threat surface in distributed training architectures.
📝 Abstract
Federated Learning (FL) has the potential for simultaneous global learning amongst a large number of parallel agents, enabling emerging AI such as LLMs to be trained across demographically diverse data. Central to its efficiency is the ability to perform sparse gradient updates and remote direct memory access at the central server. Most research on FL security focuses on protecting data privacy at the edge client or in the communication channels between the client and server. Client-facing attacks on the server are less well investigated, as the assumption is that a large collective of clients offers resilience. Here, we show that by attacking certain clients so that they cause high-frequency, repetitive memory updates on the server, we can remotely initiate a rowhammer attack on the server memory. For the first time, no backdoor access to the server is needed, and a reinforcement learning (RL) attacker can learn how to maximize repetitive server memory updates by manipulating the client's sensor observation. The consequence of the remote rowhammer attack is that we are able to achieve bit flips, which can corrupt the server memory. We demonstrate the feasibility of our attack using a large-scale FL automatic speech recognition (ASR) system with sparse updates: our adversarial attacking agent can achieve a repeated update rate (RUR) of around 70% in the targeted server model, effectively inducing bit flips in server DRAM. The security implications are that the attack can disrupt learning or may inadvertently lead to elevated privilege. This paves the way for further research on practical mitigation strategies in FL and hardware design.
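To make the repeated update rate concrete, here is a minimal sketch of one way such a metric could be computed from a trace of sparse updates. The abstract does not define RUR formally, so the definition used here (the fraction of update rounds that touch a given parameter row) is an assumption, and the function names and the example trace are hypothetical illustrations, not taken from the paper.

```python
from collections import Counter


def most_updated_row(rounds):
    """Return the row index touched in the most rounds, or None if there are none."""
    # Count each row at most once per round.
    counts = Counter(row for rows in rounds for row in set(rows))
    return counts.most_common(1)[0][0] if counts else None


def repeated_update_rate(rounds, target_row):
    """Assumed RUR: fraction of rounds whose sparse update touches target_row."""
    if not rounds:
        return 0.0
    hits = sum(1 for rows in rounds if target_row in rows)
    return hits / len(rounds)


# Hypothetical trace: each set holds the row indices written by one sparse update.
updates = [{3, 7}, {3}, {1, 3}, {2}, {3, 9}, {3}, {5}, {3, 4}, {3}, {8}]

row = most_updated_row(updates)
rur = repeated_update_rate(updates, row)
print(f"most updated row: {row}, assumed RUR: {rur:.0%}")
```

In this toy trace, row 3 is written in 7 of 10 rounds. Under the assumed definition, a higher value means the same server memory location is rewritten more often, which is the access pattern a rowhammer attack depends on.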