Path-conditioned training: a principled way to rescale ReLU neural networks

📅 2026-02-23
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the lack of principled ways to exploit rescaling symmetries in ReLU neural networks, a gap that often leads to unstable training dynamics. Building on the path-lifting framework, the authors propose a geometrically motivated rescaling criterion whose minimization aligns a kernel in path space with a chosen reference kernel. This approach systematically integrates the rescaling symmetry into the optimization process to improve training efficiency, and is, to the authors' knowledge, the first conditioning-based rescaling strategy grounded in the geometric structure of path space. Numerical experiments demonstrate that the proposed method can significantly accelerate the training of ReLU networks, highlighting the benefits of leveraging geometric insights for symmetry-aware optimization.

📝 Abstract
Despite recent algorithmic advances, we still lack principled ways to leverage the well-documented rescaling symmetries in ReLU neural network parameters. While two properly rescaled sets of weights implement the same function, their training dynamics can be dramatically different. To offer a fresh perspective on exploiting this phenomenon, we build on the recent path-lifting framework, which provides a compact factorization of ReLU networks. We introduce a geometrically motivated criterion for rescaling neural network parameters whose minimization leads to a conditioning strategy that aligns a kernel in the path-lifting space with a chosen reference. We derive an efficient algorithm to perform this alignment. In the context of random network initialization, we analyze how the architecture and the initialization scale jointly impact the output of the proposed method. Numerical experiments illustrate its potential to speed up training.
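The rescaling symmetry the abstract refers to can be seen in a few lines of numpy: scaling a hidden neuron's incoming weights by a positive factor and its outgoing weights by the inverse factor leaves the network function unchanged, because ReLU is positively homogeneous (relu(a·x) = a·relu(x) for a > 0). This is an illustrative sketch of the symmetry only, not the paper's path-conditioning algorithm; all names and shapes below are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.standard_normal((4, 3))  # incoming weights: 4 hidden units, 3 inputs
W2 = rng.standard_normal((2, 4))  # outgoing weights: 2 outputs

def net(W1, W2, x):
    """Two-layer bias-free ReLU network."""
    return W2 @ np.maximum(W1 @ x, 0.0)

x = rng.standard_normal(3)

# Rescale each hidden neuron i by a positive factor a[i]:
# incoming rows scaled by a, outgoing columns by 1/a.
a = np.array([0.5, 2.0, 3.0, 0.1])
W1s = a[:, None] * W1
W2s = W2 / a[None, :]

# The rescaled parameters implement the same function...
assert np.allclose(net(W1, W2, x), net(W1s, W2s, x))

# ...yet the parameters themselves (and hence gradient dynamics) differ.
assert not np.allclose(W1, W1s)
```

Note that the product of weights along any input-output path (one entry of `W2` times one entry of `W1` through the same hidden unit) is invariant under this rescaling, which is the quantity the path-lifting factorization works with.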
Problem

Research questions and friction points this paper is trying to address.

ReLU neural networks
rescaling symmetries
training dynamics
path-lifting
parameter conditioning
Innovation

Methods, ideas, or system contributions that make the work stand out.

path-conditioned training
ReLU neural networks
rescaling symmetry
path-lifting framework
parameter conditioning
Arthur Lebeurrier
ENS de Lyon, CNRS, Inria, Université Claude Bernard Lyon 1, LIP, UMR 5668, 69342, Lyon cedex 07, France
Titouan Vayer
Inria
optimal transport, graphs, inverse problems
Rémi Gribonval
Inria & ENS de Lyon
signal processing, machine learning, sparsity, inverse problems, dimension reduction