Multi-scale Feature Learning Dynamics: Insights for Double Descent – Analyzed epoch-wise double descent using statistical-physics tools and derived analytical expressions for the evolution of the generalization error
Gradient Starvation: A Learning Proclivity in Neural Networks – Identified and formalized 'Gradient Starvation', a phenomenon in which over-parameterized neural networks latch onto a subset of dominant features while gradients for the remaining features are suppressed
On the Learning Dynamics of Deep Neural Networks – Proved that the classification error of deep networks follows a sigmoidal curve over training, under linear-separability assumptions
Deconstructing the Ladder Network Architecture – Conducted ablation studies showing that lateral connections are the most critical component of Ladder Networks
Negative Momentum for Improved Game Dynamics – Proposed alternating gradient updates with negative momentum for stable convergence in GANs
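The intuition behind negative momentum can be sketched on the scalar bilinear game min_x max_y x·y; the function name and hyperparameter values below are illustrative choices, not taken from the paper:

```python
def alternating_play(beta=-0.5, lr=0.1, steps=2000):
    """Alternating gradient updates with heavy-ball momentum on f(x, y) = x * y.

    x is the minimizing player, y the maximizing player; beta < 0 gives
    negative momentum.
    """
    x, y = 1.0, 1.0
    vx, vy = 0.0, 0.0
    for _ in range(steps):
        vx = beta * vx + y   # grad of x*y w.r.t. x, at the current y
        x = x - lr * vx      # descent step for the min player
        vy = beta * vy + x   # grad w.r.t. y, using the freshly updated x
        y = y + lr * vy      # ascent step for the max player
    return x, y
```

With `beta = 0.0` the same alternating scheme merely orbits the equilibrium (0, 0) without approaching it; a modestly negative `beta` pulls the iterates inward toward the equilibrium.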
Zoneout: Regularizing RNNs by Randomly Preserving Hidden Activations – Introduced Zoneout, a stochastic regularization method for RNNs
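The Zoneout update itself is simple enough to sketch; the following is an illustrative NumPy version (the function name and signature are mine, not the paper's), in which a per-unit random mask decides whether to carry over the previous hidden activation or accept the new one:

```python
import numpy as np

def zoneout(h_prev, h_new, p, training=True, rng=None):
    """One Zoneout step: each hidden unit keeps its previous value with
    probability p; at evaluation time the update is the expected mixture,
    analogous to dropout's test-time averaging."""
    if not training:
        return p * h_prev + (1.0 - p) * h_new
    rng = np.random.default_rng() if rng is None else rng
    keep_prev = rng.random(np.shape(h_new)) < p  # True -> preserve old activation
    return np.where(keep_prev, h_prev, h_new)
```

Inside an RNN loop this would be applied after computing the candidate hidden state at each timestep, replacing the plain assignment `h = h_new`.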