GSVD for Geometry-Grounded Dataset Comparison: An Alignment Angle Is All You Need

📅 2026-03-10
🤖 AI Summary
This work proposes an interpretable geometric framework based on the generalized singular value decomposition (GSVD) for sample-level comparison of two datasets while preserving their intrinsic structures. By constructing a joint subspace coordinate system and imposing the common-space constraint \(Ax = By = z\), the method separates shared directions from dataset-specific ones and introduces a sample alignment angle \(\theta(z) \in [0, \pi/2]\) that quantifies whether a sample \(z\) is explained more by one dataset, more by the other, or comparably by both. As the first framework to leverage subspace alignment angles for sample-wise comparison, the proposed alignment angle serves as an interpretable diagnostic tool. Experiments on MNIST illustrate representative GSVD directions and the distribution of alignment angles, and a binary classifier built on \(\theta(z)\) is presented as an illustrative application.

📝 Abstract
Geometry-grounded learning asks models to respect structure in the problem domain rather than treating observations as arbitrary vectors. Motivated by this view, we revisit a classical but underused primitive for comparing datasets: linear relations between two data matrices, expressed via the co-span constraint $Ax = By = z$ in a shared ambient space. To operationalize this comparison, we use the generalized singular value decomposition (GSVD) as a joint coordinate system for two subspaces. In particular, we exploit the GSVD form $A = HCU$, $B = HSV$ with $C^{\top}C + S^{\top}S = I$, which separates shared from dataset-specific directions through the diagonal structure of $(C, S)$. From these factors we derive an interpretable *angle score* $\theta(z) \in [0, \pi/2]$ for a sample $z$, quantifying whether $z$ is explained relatively more by $A$, more by $B$, or comparably by both. The primary role of $\theta(z)$ is as a *per-sample geometric diagnostic*. We illustrate the behavior of the score on MNIST through angle distributions and representative GSVD directions. A binary classifier derived from $\theta(z)$ is presented as an illustrative application of the score as an interpretable diagnostic tool.
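
To make the GSVD form concrete, the following is a minimal sketch rather than the authors' implementation. It assumes the columns of $A$ and $B$ are samples in the same ambient space, that $C$ and $S$ are square diagonal, and that $AA^{\top} + BB^{\top}$ is positive definite. Under those assumptions, $A = HCU$ and $B = HSV$ with orthonormal rows in $U$ and $V$ imply $AA^{\top} = HC^{2}H^{\top}$ and $BB^{\top} = HS^{2}H^{\top}$, so $H$ can be recovered from a generalized symmetric eigenproblem. The function names `gsvd_factors` and `angle_score` are hypothetical, and the score computed here is one plausible angle in $[0, \pi/2]$ chosen for illustration; the abstract does not give the paper's exact formula for $\theta(z)$.

```python
import numpy as np
from scipy.linalg import eigh


def gsvd_factors(A, B):
    """Sketch: recover H, c, s such that
    A A^T = H diag(c^2) H^T and B B^T = H diag(s^2) H^T, with c^2 + s^2 = 1.

    Assumption of this sketch: A A^T + B B^T is positive definite.
    """
    GA = A @ A.T
    GB = B @ B.T
    # Generalized symmetric eigenproblem GA w = lam (GA + GB) w.
    # eigh normalizes W so that W.T @ (GA + GB) @ W = I.
    lam, W = eigh(GA, GA + GB)
    lam = np.clip(lam, 0.0, 1.0)  # guard against round-off outside [0, 1]
    c = np.sqrt(lam)
    s = np.sqrt(1.0 - lam)
    H = np.linalg.inv(W).T  # chosen so that H^{-1} = W.T
    return H, c, s


def angle_score(z, H, c, s):
    """Hypothetical angle score in [0, pi/2]; NOT the paper's definition.

    Expresses z in the joint coordinates w = H^{-1} z, then compares the
    norm of the S-weighted coordinates against the C-weighted ones.
    """
    w = np.linalg.solve(H, z)
    return float(np.arctan2(np.linalg.norm(s * w), np.linalg.norm(c * w)))


# Toy data: A is dominated by the first axis, B by the second.
A = np.array([[3.0, 0.0], [0.0, 1.0]])
B = np.array([[1.0, 0.0], [0.0, 3.0]])
H, c, s = gsvd_factors(A, B)
theta_1 = angle_score(np.array([1.0, 0.0]), H, c, s)
theta_2 = angle_score(np.array([0.0, 1.0]), H, c, s)
print(theta_1, theta_2)
```

In this toy example the score for the first axis falls below $\pi/4$ and the score for the second axis falls above it, matching the reading that small angles indicate directions explained more by $A$ and large angles indicate directions explained more by $B$. For rank-deficient data or large ambient dimensions, a dedicated GSVD routine would be a more robust choice than forming Gram matrices.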
Problem

Research questions and friction points this paper is trying to address.

geometry-grounded learning
dataset comparison
generalized singular value decomposition
subspace alignment
geometric diagnostic
Innovation

Methods, ideas, or system contributions that make the work stand out.

GSVD
geometry-grounded learning
dataset comparison
alignment angle
interpretable diagnostics
Eduarda de Souza Marques
Institute of Computing, UFRJ
Arthur Sobrinho Ferreira da Rocha
Institute of Computing, UFRJ
Joao Paixao
Institute of Computing, UFRJ
Heudson Mirandola
Institute of Mathematics, UFRJ
Daniel Sadoc Menasche
Federal University of Rio de Janeiro (UFRJ), Institute of Computing (IC), Brazil
Performance Evaluation, Computer Networks, Machine Learning, Security, Formal Methods