🤖 AI Summary
This paper addresses the geometric inconsistency of gradient descent across arbitrary coordinate systems and manifolds with trainable curvature. The authors propose Covariant Gradient Descent (CGD), a framework grounded in covariant differential geometry that explicitly constructs a covariant force vector (the first statistical moment of the gradients) and a covariant metric tensor (the second statistical moment), thereby achieving the first strictly covariant formulation of gradient descent. Theoretically, CGD is provably invariant under arbitrary coordinate transformations and changes of manifold curvature, while retaining linear computational complexity. RMSProp and Adam emerge as special cases of CGD in Euclidean space under specific metric choices; moreover, CGD generalizes these methods geometrically and improves their performance. Empirical results demonstrate that CGD significantly outperforms mainstream optimizers in both convergence stability and model generalization across diverse tasks.
📝 Abstract
We present a manifestly covariant formulation of the gradient descent method, ensuring consistency across arbitrary coordinate systems and general curved trainable spaces. The optimization dynamics are defined using a covariant force vector and a covariant metric tensor, computed from the first and second statistical moments of the gradients, respectively. These moments are estimated through time-averaging with an exponential weight function, which preserves linear computational complexity. We show that commonly used optimization methods such as RMSProp and Adam correspond to special limits of covariant gradient descent (CGD) and demonstrate how these methods can be further generalized and improved.
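The abstract's ingredients (exponentially weighted first and second gradient moments, with the second moment acting as a metric) can be illustrated with a minimal sketch of the diagonal-metric limit, where the update reduces to an Adam-like rule. This is an assumption-laden illustration, not the paper's exact algorithm: the function name `cgd_step` and all hyperparameter values are hypothetical, and the full CGD of the paper would use a general (non-diagonal) metric tensor.

```python
import numpy as np

def cgd_step(theta, grad, state, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One step of a CGD-style update in the diagonal-metric (Adam-like) limit.

    m: exponentially time-averaged first moment of the gradient (the "force vector")
    v: exponentially time-averaged second moment (here a diagonal "metric tensor")
    Both running averages cost O(d) per step, i.e. linear in the parameter count.
    """
    m, v, t = state
    t += 1
    m = beta1 * m + (1 - beta1) * grad        # first statistical moment
    v = beta2 * v + (1 - beta2) * grad**2     # second statistical moment (diagonal)
    m_hat = m / (1 - beta1**t)                # bias corrections for the
    v_hat = v / (1 - beta2**t)                # exponential weight function
    # Update: force vector contracted with the inverse (diagonal) metric
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, (m, v, t)

# Usage: minimize f(x) = x^2 from a distant starting point
theta = np.array([5.0])
state = (np.zeros(1), np.zeros(1), 0)
for _ in range(2000):
    theta, state = cgd_step(theta, 2 * theta, state, lr=0.05)
```

A general curved trainable space would replace the elementwise division by `sqrt(v_hat)` with multiplication by the inverse of a full metric tensor, which is where the geometric generalization beyond RMSProp/Adam would enter.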