PILOT: Policy-Informed Learned Optimization for Adaptive Deep Network Training

📅 2026-05-23

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses a limitation of conventional optimizers: their update structure is fixed before training begins, so they struggle to adapt as gradient behavior shifts between stable, noisy, and inconsistent regimes. The authors propose PILOT, an online adaptive optimizer that uses gradient-direction agreement as a policy signal to adjust the balance between momentum, normalization, and sign-based updates during training. PILOT relies only on first-order gradient information, which keeps the algorithm simple. In the reported experiments, PILOT achieves the highest accuracy among the evaluated optimizers: 94.13% on FashionMNIST and 81.94% on CIFAR-10 with a CNN, and 95.71% on FashionMNIST and 93.42% on CIFAR-10 with ResNet-18.

📝 Abstract

Despite the central role of optimization in deep learning, most optimizers rely on update structures whose functional form is fixed before training begins. This static design can limit their ability to respond to changing gradient behavior across the loss landscape, where training may shift between stable, noisy, and inconsistent regimes. This study proposes PILOT (Policy-Informed Learned OpTimizer), an online optimizer that adapts its update behavior during training. Rather than using a fixed balance between momentum, normalization, and sign-based updates, PILOT uses gradient-direction agreement as a signal of local training stability. Conditioning the update rule on this agreement signal allows the optimizer to adjust its behavior when gradients become stable, noisy, or inconsistent. Experiments on FashionMNIST and CIFAR-10 show that PILOT consistently achieves the highest accuracy among the evaluated optimizers across convolutional settings. On the CNN architecture, PILOT reaches 94.13% on FashionMNIST and 81.94% on CIFAR-10. On ResNet-18, it further improves performance, reaching 95.71% on FashionMNIST and 93.42% on CIFAR-10. These results suggest that learning how to adapt the update structure during training can improve performance across both compact and deeper convolutional models while preserving a simple first-order optimization framework. The implementation of PILOT is publicly available at https://github.com/SattamAltwaim/PILOT.git
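The abstract does not give PILOT's update formula, so the following is only a minimal sketch of the general idea it describes: measure how well the current gradient agrees with the recent gradient direction, then use that signal to weight momentum, normalized, and sign-based updates. Everything specific here is an assumption made for illustration and is not taken from the paper or its repository: the use of cosine similarity as the agreement measure, the weighting scheme, and the names `cosine_agreement` and `blended_step`.

```python
import math


def cosine_agreement(grad, momentum, eps=1e-12):
    """Cosine similarity between the current gradient and the momentum buffer.

    Assumed stand-in for "gradient-direction agreement"; returns 0.0 when
    either vector is (numerically) zero.
    """
    dot = sum(g * m for g, m in zip(grad, momentum))
    norm_g = math.sqrt(sum(g * g for g in grad))
    norm_m = math.sqrt(sum(m * m for m in momentum))
    if norm_g < eps or norm_m < eps:
        return 0.0
    return dot / (norm_g * norm_m)


def blended_step(params, grad, momentum, lr=0.1, beta=0.9, eps=1e-12):
    """One hypothetical update that blends three first-order directions.

    Returns (new_params, new_momentum, agreement).
    """
    agree = cosine_agreement(grad, momentum)

    # Candidate directions: exponential moving average, unit-norm gradient, sign.
    new_momentum = [beta * m + (1 - beta) * g for m, g in zip(momentum, grad)]
    norm_g = math.sqrt(sum(g * g for g in grad))
    normalized = [g / (norm_g + eps) for g in grad]
    signs = [(g > 0) - (g < 0) for g in grad]

    # Hypothetical weighting (weights sum to 1):
    #   high agreement  -> trust momentum
    #   near-zero       -> fall back to the normalized gradient
    #   negative        -> use the sign of the gradient
    w_mom = max(agree, 0.0)
    w_sign = max(-agree, 0.0)
    w_norm = 1.0 - w_mom - w_sign

    update = [
        w_mom * m + w_norm * n + w_sign * s
        for m, n, s in zip(new_momentum, normalized, signs)
    ]
    new_params = [p - lr * u for p, u in zip(params, update)]
    return new_params, new_momentum, agree


if __name__ == "__main__":
    # Toy usage: minimize f(x, y) = x^2 + y^2, whose gradient is (2x, 2y).
    params, momentum = [1.0, -2.0], [0.0, 0.0]
    for _ in range(50):
        grad = [2.0 * p for p in params]
        params, momentum, agree = blended_step(params, grad, momentum)
    print(params, agree)
```

The three weights are chosen here only so that they sum to one and vary smoothly with the agreement signal. A learned or differently parameterized policy, as the paper's title suggests, would replace this hand-written rule.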

Problem

Research questions and friction points this paper is trying to address.

optimization

adaptive training

gradient behavior

deep learning

update structure

Innovation

Methods, ideas, or system contributions that make the work stand out.

adaptive optimization

learned optimizer

gradient-direction agreement