🤖 AI Summary
This study addresses the lack of theoretical guidance for selecting embeddings in symmetric positive definite (SPD) manifold learning by constructing a unified Transformer framework that systematically investigates the impact of Bures–Wasserstein SPD (BWSPD), Log-Euclidean, and Euclidean embeddings on EEG covariance-matrix classification. It establishes, for the first time, a theoretical link between SPD embedding geometry and optimization dynamics: it proposes BN-Embed as an approximate Riemannian normalization scheme and proves, via bi-Lipschitz bounds, that the distortion of the BWSPD embedding is governed solely by the condition ratio κ. Experimental results demonstrate that the Log-Euclidean Transformer achieves state-of-the-art performance across three EEG paradigms, significantly outperforming conventional Riemannian methods, while BWSPD remains competitive in high-dimensional settings; BN-Embed yields a 26% accuracy gain on a 56-channel ERP task.
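To make the BN-Embed idea concrete, here is a minimal PyTorch sketch (not the authors' code): it applies ordinary batch normalization to SPD tokens after they have been flattened into an embedding space, which is the sense in which BN-Embed approximates Riemannian normalization. The class name `BNEmbed` and the tensor layout are illustrative assumptions.

```python
import torch
import torch.nn as nn

class BNEmbed(nn.Module):
    """Illustrative BN-Embed: standard batch normalization applied to SPD
    tokens after they are mapped into a flat embedding space (e.g. a
    tangent / log-Euclidean space). A sketch of the idea, not the paper's
    implementation."""

    def __init__(self, embed_dim: int):
        super().__init__()
        # Plain BatchNorm over embedding coordinates. In a tangent space
        # this approximates Riemannian normalization, with error growing
        # quadratically in the spread of the tokens around their mean.
        self.bn = nn.BatchNorm1d(embed_dim)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, embed_dim) vectorized SPD embeddings
        return self.bn(tokens)
```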
📝 Abstract
Spatial covariance matrices of EEG signals are Symmetric Positive Definite (SPD) and lie on a Riemannian manifold, yet the theoretical connection between embedding geometry and optimization dynamics remains unexplored. We provide a formal analysis linking embedding choice to gradient conditioning and numerical stability for SPD manifolds, establishing three theoretical results: (1) via Daleckiĭ–Kreĭn matrices, BWSPD attains $\sqrt{\kappa}$ gradient conditioning (vs $\kappa$ for Log-Euclidean), giving better-conditioned gradients on high-dimensional inputs ($d \geq 22$); this advantage diminishes on low-dimensional inputs ($d \leq 8$), where eigendecomposition overhead dominates; (2) Embedding-Space Batch Normalization (BN-Embed) approximates Riemannian normalization up to $O(\varepsilon^2)$ error, yielding $+26\%$ accuracy on 56-channel ERP data but a negligible effect on 8-channel SSVEP data, matching the channel-count-dependent prediction; (3) bi-Lipschitz bounds prove that BWSPD tokens preserve manifold distances with distortion governed solely by the condition ratio $\kappa$. We validate these predictions via a unified Transformer framework comparing BWSPD, Log-Euclidean, and Euclidean embeddings within an identical architecture across 1,500+ runs on three EEG paradigms (motor imagery, ERP, SSVEP; 36 subjects). Our Log-Euclidean Transformer achieves state-of-the-art performance on all datasets, substantially outperforming classical Riemannian classifiers and recent SPD baselines, while BWSPD offers competitive accuracy with similar training time.
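The $\sqrt{\kappa}$-versus-$\kappa$ conditioning claim can be unpacked with the Daleckiĭ–Kreĭn theorem, which expresses the derivative of a matrix function entrywise in the eigenbasis. The following is a sketch of the standard argument, assuming BWSPD embeddings are built from the matrix square root; it is a reconstruction, not the paper's proof.

```latex
% Daleckiĭ–Kreĭn: derivative of f(A) for SPD A = U \Lambda U^\top
\[
  Df(A)[H] \;=\; U\bigl(\Gamma^{f} \circ (U^\top H U)\bigr)U^\top,
  \qquad
  \Gamma^{f}_{ij} \;=\;
  \begin{cases}
    \dfrac{f(\lambda_i)-f(\lambda_j)}{\lambda_i-\lambda_j}, & i \neq j,\\[1ex]
    f'(\lambda_i), & i = j.
  \end{cases}
\]
% For f = log (Log-Euclidean), every entry of \Gamma^{\log} lies in
% [1/\lambda_{\max}, 1/\lambda_{\min}], so the spread of gradient scales is
% \lambda_{\max}/\lambda_{\min} = \kappa.
% For f(x) = \sqrt{x} (assumed BWSPD map),
\[
  \Gamma^{\sqrt{\cdot}}_{ij} \;=\; \frac{1}{\sqrt{\lambda_i}+\sqrt{\lambda_j}}
  \;\in\; \Bigl[\tfrac{1}{2\sqrt{\lambda_{\max}}},\,
                \tfrac{1}{2\sqrt{\lambda_{\min}}}\Bigr],
\]
% so the spread is only \sqrt{\kappa}.
```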
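As a minimal NumPy sketch of how the three embeddings could be computed from an EEG spatial covariance matrix. The abstract does not specify the token construction, so this is illustrative only: the use of the matrix square root for BWSPD is our assumption, motivated by the Bures–Wasserstein geometry, and the $\sqrt{2}$-scaled upper-triangle vectorization is one common convention.

```python
import numpy as np

def _eig_fun(C: np.ndarray, fun) -> np.ndarray:
    """Apply a scalar function to an SPD matrix via eigendecomposition."""
    w, V = np.linalg.eigh(C)
    w = np.clip(w, 1e-10, None)  # guard against numerical non-positivity
    return (V * fun(w)) @ V.T    # V @ diag(fun(w)) @ V.T

def vec_upper(S: np.ndarray) -> np.ndarray:
    """Vectorize a symmetric matrix: diagonal plus sqrt(2)-scaled upper
    triangle, so Euclidean distance matches the Frobenius norm."""
    iu = np.triu_indices(S.shape[0], k=1)
    return np.concatenate([np.diag(S), np.sqrt(2) * S[iu]])

def embed(C: np.ndarray, kind: str) -> np.ndarray:
    """Map one EEG spatial covariance matrix to a flat token."""
    if kind == "euclidean":       # identity map, no eigendecomposition
        return vec_upper(C)
    if kind == "log-euclidean":   # matrix logarithm; gradient spread ~ kappa
        return vec_upper(_eig_fun(C, np.log))
    if kind == "bwspd":           # matrix square root (assumed); ~ sqrt(kappa)
        return vec_upper(_eig_fun(C, np.sqrt))
    raise ValueError(kind)

# Example: tokens for one simulated 22-channel motor-imagery trial
rng = np.random.default_rng(0)
X = rng.standard_normal((22, 1000))      # channels x samples
C = X @ X.T / X.shape[1]                 # spatial covariance (SPD)
tokens = {k: embed(C, k) for k in ("euclidean", "log-euclidean", "bwspd")}
```

Note that Log-Euclidean and BWSPD embeddings share the same per-token eigendecomposition cost, which is consistent with the abstract's observation of similar training time.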