🤖 AI Summary
Reinforcement learning (RL) policies often exhibit poor robustness under parameter perturbations, yet the underlying mechanisms governing parameter-level resilience remain poorly understood.
Method: We propose a neuron-level parameter classification framework grounded in stress response analysis, formally defining and identifying three distinct parameter categories—fragile, robust, and antifragile—where antifragile parameters *improve* policy performance under internal or external stressors (e.g., synaptic filtering or adversarial attacks). Integrating this with PPO, we systematically apply structured perturbations across MuJoCo continuous-control tasks and introduce a parameter sensitivity score to quantify functional roles.
Contribution/Results: We empirically verify the existence of antifragile parameters and demonstrate that selectively preserving or amplifying them significantly enhances policy robustness and generalization. This work establishes a novel paradigm for interpreting RL policy resilience and enables explainable, parameter-aware robust optimization.
📝 Abstract
This paper explores reinforcement learning (RL) policy robustness by systematically analyzing network parameters under internal and external stresses. Inspired by synaptic plasticity in neuroscience, synaptic filtering introduces internal stress by selectively perturbing parameters, while adversarial attacks apply external stress through modified agent observations. This dual approach enables the classification of parameters as fragile, robust, or antifragile, based on their influence on policy performance in clean and adversarial settings. Parameter scores are defined to quantify these characteristics, and the framework is validated on PPO-trained agents in MuJoCo continuous-control environments. The results highlight the presence of antifragile parameters that enhance policy performance under stress, demonstrating the potential of targeted filtering techniques to improve RL policy adaptability. These insights provide a foundation for future advancements in the design of robust and antifragile RL systems.
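The abstract does not spell out the filter or the scoring rule, so the following is only a minimal sketch of the general idea, not the paper's implementation. It assumes a magnitude-band filter that zeroes the selected weights and a relative-return threshold for labelling; the names `magnitude_filter`, `classify`, and the tolerance `tol` are hypothetical.

```python
import numpy as np


def magnitude_filter(weights, low, high):
    """Assumed filter: zero every weight whose magnitude lies in [low, high)."""
    magnitude = np.abs(weights)
    mask = (magnitude >= low) & (magnitude < high)
    return np.where(mask, 0.0, weights)


def classify(baseline_return, filtered_return, tol=0.05):
    """Label the filtered parameter group by the relative change in return.

    Assumed rule: a clear gain means antifragile, a clear loss means fragile,
    and anything within the tolerance band counts as robust.
    """
    scale = max(abs(baseline_return), 1e-8)  # guard against a zero baseline
    change = (filtered_return - baseline_return) / scale
    if change > tol:
        return "antifragile"
    if change < -tol:
        return "fragile"
    return "robust"


# Toy layer: filter the small-magnitude weights only.
layer = np.array([[0.1, -0.5], [2.0, -0.05]])
filtered_layer = magnitude_filter(layer, low=0.0, high=0.2)

# Hypothetical episode returns before and after filtering.
label = classify(baseline_return=100.0, filtered_return=110.0)
```

In practice the two returns would come from rolling out the PPO policy with the original and the filtered weights, once on clean observations and once on adversarially perturbed ones, so each parameter group would receive a label per setting.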