Provably Robust Federated Reinforcement Learning

📅 2025-02-12
📈 Citations: 0
Influential: 0
🤖 AI Summary
Federated reinforcement learning (FRL), due to its decentralized design, is highly vulnerable to poisoning attacks, in particular the newly introduced Normalized attack based on angular deviation, against which existing Byzantine-robust aggregation methods fail. Method: The paper theoretically characterizes FRL's vulnerability under angular-deviation attacks and proposes the first provably robust ensemble FRL framework. It trains multiple global policies in parallel, each with any foundational aggregation rule, and combines their predicted actions by majority vote for discrete action spaces or the geometric median for continuous ones. Contribution/Results: The framework provides unified theoretical robustness guarantees against both known and newly proposed Byzantine attacks. Experiments across multiple task settings show that it reduces the performance degradation caused by the Normalized attack by 47%–89% while preserving at least 92% of original task performance, substantially outperforming state-of-the-art baselines.

📝 Abstract
Federated reinforcement learning (FRL) allows agents to jointly learn a global decision-making policy under the guidance of a central server. While FRL has advantages, its decentralized design makes it prone to poisoning attacks. To mitigate this, Byzantine-robust aggregation techniques tailored for FRL have been introduced. Yet, in our work, we reveal that these techniques are not immune to our newly introduced Normalized attack. Unlike previous attacks, which aimed to enlarge the distance between policy updates before and after an attack, our Normalized attack maximizes the angle of deviation between these updates. To counter these threats, we develop an ensemble FRL approach that is provably secure against both known attacks and our newly proposed one. Our ensemble method trains multiple global policies, each learned by a group of agents using any foundational aggregation rule. These trained global policies then individually predict the action for a given test state, and the final action is chosen by majority vote for discrete action spaces or by the geometric median for continuous ones. Our experimental results across different settings show that the Normalized attack can greatly disrupt non-ensemble Byzantine-robust methods, and that our ensemble approach offers substantial resistance against poisoning attacks.
Problem

Research questions and friction points this paper is trying to address.

Mitigates poisoning in Federated Reinforcement Learning
Introduces Normalized attack on policy updates
Develops secure ensemble FRL approach
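A minimal sketch of the idea behind an angle-maximizing ("Normalized") attack, as described in the abstract: rather than pushing the poisoned update far away in distance, the attacker rotates the benign update direction by a target angle while preserving its norm. The construction and function name below are hypothetical illustrations, not the paper's exact attack.

```python
import numpy as np

def normalized_angular_attack(benign_update, target_angle_deg=180.0):
    """Hypothetical sketch: return a malicious update whose angle to the
    benign update equals target_angle_deg, with the benign norm preserved
    (the "normalized" aspect). Not the paper's exact construction."""
    g = np.asarray(benign_update, dtype=float)
    norm = np.linalg.norm(g)
    u = g / norm  # unit vector along the benign update
    # Pick a random direction orthogonal to u to rotate toward.
    r = np.random.default_rng(0).standard_normal(g.shape)
    v = r - (r @ u) * u
    v /= np.linalg.norm(v)
    theta = np.deg2rad(target_angle_deg)
    # Rotate u by theta in the (u, v) plane, then rescale to the benign norm.
    return norm * (np.cos(theta) * u + np.sin(theta) * v)
```

With the default 180-degree target the malicious update is simply the benign update flipped in sign, which keeps its magnitude plausible to distance-based defenses while maximizing angular deviation.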
Innovation

Methods, ideas, or system contributions that make the work stand out.

Ensemble FRL approach
Majority vote strategy
Geometric median technique
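The ensemble defense described in the abstract can be sketched as follows: each trained global policy predicts an action for the test state, and the final action is the majority vote (discrete action spaces) or the geometric median (continuous ones). The function names are placeholders, and the Weiszfeld iteration used for the geometric median is one standard solver choice, not necessarily the paper's.

```python
import numpy as np
from collections import Counter

def ensemble_action_discrete(actions):
    """Majority vote over the actions predicted by each global policy."""
    return Counter(actions).most_common(1)[0][0]

def geometric_median(actions, iters=100, eps=1e-8):
    """Geometric median of continuous actions via Weiszfeld iteration
    (one common solver; assumed here, not specified by the paper)."""
    pts = np.asarray(actions, dtype=float)
    y = pts.mean(axis=0)  # initialize at the coordinate-wise mean
    for _ in range(iters):
        d = np.maximum(np.linalg.norm(pts - y, axis=1), eps)
        w = 1.0 / d  # closer points get larger weight
        y_new = (w[:, None] * pts).sum(axis=0) / w.sum()
        if np.linalg.norm(y_new - y) < eps:
            break
        y = y_new
    return y
```

Both rules are robust in the same sense: as long as a majority of the global policies are trained by groups containing no malicious agents, a minority of corrupted policies cannot move the vote or drag the geometric median arbitrarily far.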