Balancing Expressivity and Robustness: Constrained Rational Activations for Reinforcement Learning

📅 2025-07-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
In reinforcement learning, trainable rational activation functions increase representational capacity but suffer from overestimation bias, and from feature collapse in continual learning, revealing a trade-off between expressivity and training stability (and, in the continual setting, plasticity). To address this, the authors propose a structurally constrained rational activation: by restricting the form of the denominator and bounding the output magnitude, the constraint suppresses gradient explosion and uncontrolled output scaling while preserving approximation capability, substantially improving training robustness. The activation's parameters are optimized end to end alongside the network weights. Empirically, the method improves performance and convergence stability on the continuous-control benchmarks MetaWorld and DeepMind Control Suite, and mitigates catastrophic forgetting on MNIST and Split CIFAR-100 continual learning tasks, improving long-term feature retention. The work systematically identifies and alleviates the stability bottlenecks of rational activations in both reinforcement and continual learning, pointing toward learnable nonlinearities that combine high expressivity with reliability.

📝 Abstract
Trainable activation functions, whose parameters are optimized alongside network weights, offer increased expressivity compared to fixed activation functions. Specifically, trainable activation functions defined as ratios of polynomials (rational functions) have been proposed to enhance plasticity in reinforcement learning. However, their impact on training stability remains unclear. In this work, we study trainable rational activations in both reinforcement and continual learning settings. We find that while their flexibility enhances adaptability, it can also introduce instability, leading to overestimation in RL and feature collapse in longer continual learning scenarios. Our main result is demonstrating a trade-off between expressivity and plasticity in rational activations. To address this, we propose a constrained variant that structurally limits excessive output scaling while preserving adaptability. Experiments across MetaWorld and DeepMind Control Suite (DMC) environments show that our approach improves training stability and performance. In continual learning benchmarks, including MNIST with reshuffled labels and Split CIFAR-100, we reveal how different constraints affect the balance between expressivity and long-term retention. Preliminary experiments in discrete-action domains (e.g., Atari) did not show similar instability, suggesting that the trade-off is particularly relevant for continuous control. Together, our findings provide actionable design principles for robust and adaptable trainable activations in dynamic, non-stationary environments. Code available at: https://github.com/special114/rl_rational_plasticity.
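The constrained rational activation described above can be sketched as follows. The abstract does not specify the exact parameterization, so the coefficient layout, the `1 + |·|` denominator (which keeps Q(x) ≥ 1 and rules out division blow-ups), and the optional hard output clip are illustrative assumptions, not the paper's precise formulation:

```python
import numpy as np

def safe_rational(x, p_coeffs, q_coeffs, bound=None):
    """Rational activation y = P(x) / Q(x) with a structurally safe denominator.

    p_coeffs: numerator coefficients [p0, p1, ...] for p0 + p1*x + p2*x^2 + ...
    q_coeffs: denominator coefficients [q1, q2, ...] for the degree >= 1 terms;
              Q(x) = 1 + |q1*x + q2*x^2 + ...| >= 1 everywhere, so the ratio
              can never blow up from a vanishing denominator.
    bound:    optional hard clip on |y|, limiting excessive output scaling
              (an assumed stand-in for the paper's output-magnitude bound).
    """
    x = np.asarray(x, dtype=float)
    # np.polyval expects the highest-degree coefficient first, hence the reversals.
    num = np.polyval(np.asarray(p_coeffs, dtype=float)[::-1], x)
    q_full = np.concatenate(([0.0], np.asarray(q_coeffs, dtype=float)))  # constant term fixed at 0
    den = 1.0 + np.abs(np.polyval(q_full[::-1], x))
    y = num / den
    if bound is not None:
        y = np.clip(y, -bound, bound)
    return y

# Identity-like numerator P(x) = x with denominator Q(x) = 1 + |0.5*x|:
# the output grows sublinearly and stays finite even for large |x|.
y = safe_rational([-10.0, 0.0, 10.0], [0.0, 1.0], [0.5], bound=3.0)
```

In a trainable version, `p_coeffs` and `q_coeffs` would be learnable parameters updated by gradient descent together with the network weights; the point of the constraint is that no setting of those parameters can make the denominator vanish or the output scale without limit.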
Problem

Research questions and friction points this paper is trying to address.

Balancing expressivity and robustness in trainable activation functions
Addressing instability in reinforcement and continual learning settings
Designing constrained rational activations for improved training stability
Innovation

Methods, ideas, or system contributions that make the work stand out.

Trainable rational activations enhance reinforcement learning plasticity
Constrained variant limits output scaling, preserves adaptability
Improves stability in continuous control, continual learning benchmarks