🤖 AI Summary
Low-precision neural network training suffers from a fundamental trade-off between stability and accuracy. Method: Inspired by the multiplicative synaptic noise dynamics observed in biological systems, we propose a Bayesian learning framework grounded in a log-normal posterior assumption. We introduce, for the first time, multiplicative dynamics into artificial neural network training—combining multiplicative noise with implicit regularization in the parameter updates—while requiring only one additional vector of parameters, thus minimizing memory overhead. The method supports fully low-precision forward passes and is compatible with mainstream large architectures, including ViT and GPT-2. Contribution/Results: Experiments demonstrate stable from-scratch training under purely low-precision forward computation, enhanced robustness of learning and inference on energy-efficient hardware, and final accuracies matching those achieved by Adam. This work establishes a novel paradigm for efficient Bayesian learning tailored to edge AI.
📝 Abstract
Studies in neuroscience have shown that the strengths of biological synapses follow a log-normal distribution whose temporal evolution can be explained by noisy multiplicative dynamics. Biological networks can function stably even under dynamically fluctuating conditions arising from unreliable synaptic transmission. Here we ask: Is it possible to design similar multiplicative training dynamics for artificial neural networks? To answer this question, we derive a Bayesian learning rule that assumes log-normal posterior distributions over weights, which gives rise to a new Log-Normal Multiplicative Dynamics (LMD) algorithm. The algorithm uses multiplicative updates in which both the noise and the regularization are applied multiplicatively. The method is as easy to implement as Adam and requires storing only one additional vector. Our results show that LMD achieves stable and accurate training from scratch under low-precision forward operations for Vision Transformer and GPT-2. These results suggest that multiplicative dynamics, a biological feature, may enable stable low-precision inference and learning on future energy-efficient hardware.
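To make the idea of a multiplicative update concrete, here is a minimal sketch of what such a rule could look like. This is an illustration of the general principle only, not the paper's actual LMD derivation: the function name `lmd_step`, the specific log-space parameterization, and all hyperparameter values are assumptions for the example. Weight magnitudes are updated in log-space, so the gradient step, the noise, and the regularization all act multiplicatively, and Gaussian log-space noise keeps the magnitudes log-normally distributed.

```python
import numpy as np

def lmd_step(w, grad, lr=0.01, noise_std=0.01, weight_decay=1e-4, rng=None):
    """One hypothetical multiplicative update step (illustrative, not the
    paper's exact rule).

    The magnitude of each weight is updated in log-space, so the descent
    step, the weight decay, and the injected noise are all multiplicative
    factors on the magnitude; signs are kept fixed and magnitudes stay
    positive (and log-normal under Gaussian log-space noise).
    """
    rng = np.random.default_rng() if rng is None else rng
    sign = np.sign(w)
    log_m = np.log(np.abs(w) + 1e-12)  # work on magnitudes in log-space
    # Chain rule: d(loss)/d(log m) = m * d(loss)/d|w| = m * sign * grad,
    # so a gradient step in log-space scales each magnitude multiplicatively.
    log_m -= lr * (sign * grad * np.exp(log_m) + weight_decay)
    log_m += noise_std * rng.standard_normal(w.shape)  # multiplicative noise
    return sign * np.exp(log_m)
```

Note that, as in the abstract, the only extra state such a scheme would need beyond the weights themselves is one additional per-parameter vector (here, the log-magnitudes could be cached instead of recomputed each step).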