🤖 AI Summary
This study addresses the lack of theoretical guidance for selecting embeddings in symmetric positive definite (SPD) manifold learning by constructing a unified Transformer framework that systematically investigates the impact of Bures–Wasserstein SPD (BWSPD), Log-Euclidean, and Euclidean embeddings on EEG covariance-matrix classification. It establishes, for the first time, a theoretical link between SPD embedding geometry and optimization dynamics: it proposes BN-Embed as an approximate Riemannian normalization scheme and proves, via bi-Lipschitz bounds, that the distortion of the BWSPD embedding is governed solely by the condition ratio κ. Experimental results demonstrate that the Log-Euclidean Transformer achieves state-of-the-art performance across three EEG paradigms, significantly outperforming conventional Riemannian methods, while BWSPD remains competitive in high-dimensional settings; BN-Embed yields a 26% accuracy gain on a 56-channel ERP task.
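To make the BN-Embed idea concrete, here is a minimal PyTorch sketch (not the authors' code): it applies ordinary batch normalization to SPD tokens after they have been flattened into an embedding space, which is the sense in which BN-Embed approximates Riemannian normalization. The class name `BNEmbed` and the tensor layout are illustrative assumptions.

```python
import torch
import torch.nn as nn

class BNEmbed(nn.Module):
    """Illustrative BN-Embed: standard batch normalization applied to SPD
    tokens after they are mapped into a flat embedding space (e.g. a
    tangent / log-Euclidean space). A sketch of the idea, not the paper's
    implementation."""

    def __init__(self, embed_dim: int):
        super().__init__()
        # Plain BatchNorm over embedding coordinates. In a tangent space
        # this approximates Riemannian normalization, with error growing
        # quadratically in the spread of the tokens around their mean.
        self.bn = nn.BatchNorm1d(embed_dim)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, embed_dim) vectorized SPD embeddings
        return self.bn(tokens)
```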
📝 Abstract
Spatial covariance matrices of EEG signals are Symmetric Positive Definite (SPD) and lie on a Riemannian manifold, yet the theoretical connection between embedding geometry and optimization dynamics remains unexplored. We provide a formal analysis linking embedding choice to gradient conditioning and numerical stability for SPD manifolds, establishing three theoretical results: (1) via Daleckiĭ–Kreĭn matrices, BWSPD attains $\sqrt{\kappa}$ gradient conditioning (vs $\kappa$ for Log-Euclidean), giving better-conditioned gradients on high-dimensional inputs ($d \geq 22$); this advantage diminishes on low-dimensional inputs ($d \leq 8$), where eigendecomposition overhead dominates; (2) Embedding-Space Batch Normalization (BN-Embed) approximates Riemannian normalization up to $O(\varepsilon^2)$ error, yielding $+26\%$ accuracy on 56-channel ERP data but a negligible effect on 8-channel SSVEP data, matching the channel-count-dependent prediction; (3) bi-Lipschitz bounds prove that BWSPD tokens preserve manifold distances with distortion governed solely by the condition ratio $\kappa$. We validate these predictions via a unified Transformer framework comparing BWSPD, Log-Euclidean, and Euclidean embeddings within an identical architecture across 1,500+ runs on three EEG paradigms (motor imagery, ERP, SSVEP; 36 subjects). Our Log-Euclidean Transformer achieves state-of-the-art performance on all datasets, substantially outperforming classical Riemannian classifiers and recent SPD baselines, while BWSPD offers competitive accuracy with similar training time.
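The $\sqrt{\kappa}$-versus-$\kappa$ conditioning claim can be unpacked with the Daleckiĭ–Kreĭn theorem, which expresses the derivative of a matrix function entrywise in the eigenbasis. The following is a sketch of the standard argument, assuming BWSPD embeddings are built from the matrix square root; it is a reconstruction, not the paper's proof.

```latex
% Daleckiĭ–Kreĭn: derivative of f(A) for SPD A = U \Lambda U^\top
\[
  Df(A)[H] \;=\; U\bigl(\Gamma^{f} \circ (U^\top H U)\bigr)U^\top,
  \qquad
  \Gamma^{f}_{ij} \;=\;
  \begin{cases}
    \dfrac{f(\lambda_i)-f(\lambda_j)}{\lambda_i-\lambda_j}, & i \neq j,\\[1ex]
    f'(\lambda_i), & i = j.
  \end{cases}
\]
% For f = log (Log-Euclidean), every entry of \Gamma^{\log} lies in
% [1/\lambda_{\max}, 1/\lambda_{\min}], so the spread of gradient scales is
% \lambda_{\max}/\lambda_{\min} = \kappa.
% For f(x) = \sqrt{x} (assumed BWSPD map),
\[
  \Gamma^{\sqrt{\cdot}}_{ij} \;=\; \frac{1}{\sqrt{\lambda_i}+\sqrt{\lambda_j}}
  \;\in\; \Bigl[\tfrac{1}{2\sqrt{\lambda_{\max}}},\,
                \tfrac{1}{2\sqrt{\lambda_{\min}}}\Bigr],
\]
% so the spread is only \sqrt{\kappa}.
```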
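As a minimal NumPy sketch of how the three embeddings could be computed from an EEG spatial covariance matrix. The abstract does not specify the token construction, so this is illustrative only: the use of the matrix square root for BWSPD is our assumption, motivated by the Bures–Wasserstein geometry, and the $\sqrt{2}$-scaled upper-triangle vectorization is one common convention.

```python
import numpy as np

def _eig_fun(C: np.ndarray, fun) -> np.ndarray:
    """Apply a scalar function to an SPD matrix via eigendecomposition."""
    w, V = np.linalg.eigh(C)
    w = np.clip(w, 1e-10, None)  # guard against numerical non-positivity
    return (V * fun(w)) @ V.T    # V @ diag(fun(w)) @ V.T

def vec_upper(S: np.ndarray) -> np.ndarray:
    """Vectorize a symmetric matrix: diagonal plus sqrt(2)-scaled upper
    triangle, so Euclidean distance matches the Frobenius norm."""
    iu = np.triu_indices(S.shape[0], k=1)
    return np.concatenate([np.diag(S), np.sqrt(2) * S[iu]])

def embed(C: np.ndarray, kind: str) -> np.ndarray:
    """Map one EEG spatial covariance matrix to a flat token."""
    if kind == "euclidean":       # identity map, no eigendecomposition
        return vec_upper(C)
    if kind == "log-euclidean":   # matrix logarithm; gradient spread ~ kappa
        return vec_upper(_eig_fun(C, np.log))
    if kind == "bwspd":           # matrix square root (assumed); ~ sqrt(kappa)
        return vec_upper(_eig_fun(C, np.sqrt))
    raise ValueError(kind)

# Example: tokens for one simulated 22-channel motor-imagery trial
rng = np.random.default_rng(0)
X = rng.standard_normal((22, 1000))      # channels x samples
C = X @ X.T / X.shape[1]                 # spatial covariance (SPD)
tokens = {k: embed(C, k) for k in ("euclidean", "log-euclidean", "bwspd")}
```

Note that Log-Euclidean and BWSPD embeddings share the same per-token eigendecomposition cost, which is consistent with the abstract's observation of similar training time.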