🤖 AI Summary
Existing Riemannian Batch Normalization (RBN) methods exhibit insufficient robustness on ill-conditioned symmetric positive-definite (SPD) matrices, limiting their effectiveness in covariance-based representation learning.
Method: We propose a novel RBN method grounded in the Generalized Bures–Wasserstein Metric (GBWM), the first to incorporate a learnable GBWM into the batch normalization framework. Our approach combines matrix power deformations to enrich geometric modeling on the SPD manifold and employs Riemannian gradient descent for stable statistical estimation and parameter updates directly on the manifold.
Contribution/Results: Extensive experiments demonstrate that our method significantly improves classification and regression accuracy of SPD neural networks across multiple benchmark datasets. Notably, it achieves faster convergence and superior generalization—especially under ill-conditioned covariance scenarios—establishing a more robust normalization paradigm for deep learning on SPD manifolds.
📝 Abstract
Covariance matrices have proven highly effective across many scientific fields. Since these matrices lie on the Symmetric Positive Definite (SPD) manifold, a Riemannian space with intrinsic non-Euclidean geometry, the primary challenge in representation learning is to respect this underlying geometric structure. Drawing inspiration from the success of Euclidean deep learning, researchers have developed neural networks on the SPD manifold for more faithful covariance embedding learning. A notable advancement in this area is Riemannian batch normalization (RBN), which has been shown to improve the performance of SPD network models. Nonetheless, the Riemannian metric underlying existing RBN may fail to deal effectively with ill-conditioned SPD matrices, undermining the effectiveness of RBN. In contrast, the Bures-Wasserstein metric (BWM) performs well under ill-conditioning. In addition, the recently introduced Generalized BWM (GBWM) parameterizes the vanilla BWM via an SPD matrix, allowing for a richer representation of the diverse geometries of the SPD manifold. We therefore propose a novel RBN algorithm based on the GBW geometry, incorporating a learnable metric parameter. Moreover, a deformation of the GBWM by matrix power is also introduced to further enhance the representational capacity of GBWM-based RBN. Experimental results on different datasets validate the effectiveness of our proposed method.
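As a concrete illustration of the geometry the abstract refers to, the squared Bures-Wasserstein distance between SPD matrices A and B is tr(A) + tr(B) − 2 tr((A^{1/2} B A^{1/2})^{1/2}). The sketch below is not the paper's implementation (function names `spd_pow` and `bw_distance` are our own); it computes the vanilla BWM with a NumPy eigendecomposition-based matrix square root, together with the matrix power map A ↦ A^θ of the kind used in power-deformed metrics:

```python
import numpy as np

def spd_pow(A, theta):
    """Matrix power A**theta of an SPD matrix via eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return V @ np.diag(np.clip(w, 0.0, None) ** theta) @ V.T

def bw_distance(A, B):
    """Bures-Wasserstein distance between SPD matrices A and B:
    d_BW(A, B)^2 = tr(A) + tr(B) - 2 tr((A^{1/2} B A^{1/2})^{1/2})."""
    rA = spd_pow(A, 0.5)
    cross = spd_pow(rA @ B @ rA, 0.5)
    d2 = np.trace(A) + np.trace(B) - 2.0 * np.trace(cross)
    return float(np.sqrt(max(d2, 0.0)))  # clamp tiny negative round-off

# Example: for diagonal matrices the cross term reduces to elementwise
# square roots, so d_BW(diag(4,1), I)^2 = (4+1) + 2 - 2*(2+1) = 1.
A = np.diag([4.0, 1.0])
B = np.eye(2)
print(bw_distance(A, B))  # -> 1.0
```

Unlike the affine-invariant metric, this distance stays well defined as eigenvalues of A or B approach zero, which is the behavior that makes BWM attractive for ill-conditioned covariance matrices.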