🤖 AI Summary
This work proposes a differentiable framework based on the Gumbel-Softmax reparameterization that adaptively selects activation functions from a predefined set. Unlike conventional deep learning models that fix a single activation function, which limits adaptability across diverse tasks, the proposed method offers an input-agnostic, discrete, yet differentiable mechanism for choosing the activation function during training. The approach pairs theoretical grounding with practical engineering utility, using a modular design to enhance model flexibility. Experiments on synthetic datasets show that the method consistently selects a near-optimal activation function, improving predictive accuracy and supporting its effectiveness and robustness.
📝 Abstract
Learning activation functions has emerged as a promising direction in deep learning, allowing networks to adapt activation mechanisms to task-specific demands. In this work, we introduce a novel framework that employs the Gumbel-Softmax trick to enable discrete yet differentiable selection among a predefined set of activation functions during training. Our method dynamically learns the optimal activation function independently of the input, thereby enhancing both predictive accuracy and architectural flexibility. Experiments on synthetic datasets show that our model consistently selects the most suitable activation function, underscoring its effectiveness. These results connect theoretical advances with practical utility, paving the way for more adaptive and modular neural architectures in complex learning scenarios.
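The selection mechanism the abstract describes can be sketched as follows. This is a minimal, hypothetical illustration (the function and variable names, the candidate set, and the temperature `tau` are our assumptions, not the paper's): logits parameterize a distribution over candidate activations, Gumbel noise plus a softmax yields a relaxed one-hot weight vector, and the layer output is the weighted combination, so gradients can flow back to the selection logits while low temperatures approach a discrete choice.

```python
# Hypothetical sketch of Gumbel-Softmax activation selection; not the paper's code.
import math
import random

# Predefined candidate activation functions (an assumed example set).
ACTIVATIONS = [
    ("relu", lambda x: max(x, 0.0)),
    ("tanh", math.tanh),
    ("sigmoid", lambda x: 1.0 / (1.0 + math.exp(-x))),
]

def gumbel_softmax(logits, tau=1.0, rng=random):
    """Sample a relaxed one-hot weight vector over the candidates.

    Lower tau pushes the weights toward a discrete (one-hot) selection."""
    # Gumbel(0, 1) noise via inverse transform sampling: -log(-log(U)).
    g = [-math.log(-math.log(rng.random())) for _ in logits]
    y = [(l + gi) / tau for l, gi in zip(logits, g)]
    # Numerically stable softmax over the perturbed logits.
    m = max(y)
    e = [math.exp(v - m) for v in y]
    s = sum(e)
    return [v / s for v in e]

def select_activation(x, logits, tau=1.0, rng=random):
    """Apply the soft mixture of candidate activations to a scalar input x.

    In training, the weights are differentiable in the logits; at low tau
    the mixture approximates picking a single activation function."""
    w = gumbel_softmax(logits, tau, rng)
    return sum(wi * f(x) for wi, (_, f) in zip(w, ACTIVATIONS))

rng = random.Random(0)
logits = [2.0, 0.0, -1.0]  # learnable selection parameters (assumed values)
out = select_activation(0.5, logits, tau=0.1, rng=rng)
```

Because the choice depends only on the learned logits, not on `x`, the selection is input-agnostic in the sense the abstract uses; annealing `tau` toward zero recovers a discrete pick while keeping training end-to-end differentiable.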