🤖 AI Summary
This work addresses the instability that arises when noise injection and Dropout are applied to RNNs independently during training. The authors propose Variational Adaptive Noise and Dropout (VAND), which reformulates RNN training as a variational inference problem. Within this unified framework, VAND jointly optimizes the noise scale on the hidden state and the Dropout rate, converting the explicit regularization term of the optimization problem into an implicit one. Theoretically, this provides a new foundation for stable RNN learning. Experimentally, on a mobile manipulator imitation learning task, VAND is the only method that successfully reproduces the instructed sequential and periodic behaviors, improving long-horizon stability and generalization.
📝 Abstract
This paper proposes a novel stable learning theory for recurrent neural networks (RNNs), termed variational adaptive noise and dropout (VAND). Noise and dropout applied to the internal state of RNNs have each been confirmed as stabilizing factors in previous studies. We reinterpret the optimization problem of RNNs as variational inference, showing that noise and dropout can be derived simultaneously by converting the explicit regularization term arising in this problem into implicit regularization. Their scale and ratio, respectively, are also adjusted automatically so as to optimize the main objective of the RNN. In an imitation learning scenario with a mobile manipulator, only VAND is able to imitate sequential and periodic behaviors as instructed. Video: https://youtu.be/UOho3Xr6A2w
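To make the core idea concrete, here is a minimal sketch of one RNN step with adaptive noise and dropout on the hidden state. This is our own illustrative reconstruction, not the paper's exact derivation: the parameter names (`rho_sigma`, `rho_drop`) and the softplus/sigmoid mappings are assumptions, standing in for the quantities that VAND would optimize jointly with the main objective.

```python
import numpy as np

rng = np.random.default_rng(0)

def softplus(x):
    # smooth map from an unconstrained parameter to a positive noise scale
    return np.log1p(np.exp(x))

def vand_step(h, W, U, x, rho_sigma, rho_drop, train=True):
    """One vanilla-RNN step with adaptive noise and dropout on the hidden state.

    rho_sigma and rho_drop are unconstrained trainable parameters (hypothetical
    names) mapped to a positive noise scale and a dropout rate in (0, 1).
    """
    sigma = softplus(rho_sigma)                # adaptive noise scale > 0
    p_drop = 1.0 / (1.0 + np.exp(-rho_drop))   # adaptive dropout rate in (0, 1)
    if train:
        h = h + sigma * rng.standard_normal(h.shape)  # Gaussian noise on state
        mask = rng.random(h.shape) >= p_drop          # Bernoulli keep mask
        h = h * mask / (1.0 - p_drop)                 # inverted dropout scaling
    return np.tanh(W @ h + U @ x)

# Tiny usage example with random weights.
H, X = 4, 3
h = np.zeros(H)
W, U = rng.standard_normal((H, H)), rng.standard_normal((H, X))
x = rng.standard_normal(X)
h_next = vand_step(h, W, U, x, rho_sigma=-2.0, rho_drop=0.0)
```

In the paper's framework, the gradients with respect to the noise scale and dropout rate would come out of the variational objective itself rather than being hand-tuned; the sketch only shows where those two quantities act on the hidden state.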