🤖 AI Summary
This work establishes the theoretical foundations of sorted permutation-invariant embeddings: (1) determining the minimal embedding dimension $D$ guaranteeing injectivity; and (2) characterizing tight bounds on their bi-Lipschitz constants. We propose an embedding construction based on a linear matrix transformation followed by column-wise sorting, and rigorously derive necessary and sufficient conditions for injectivity. Crucially, we establish for the first time the quantitative dependence of the bi-Lipschitz constant on the input size $n$, proving a distortion lower bound of $\Omega(\sqrt{n})$. Compared to prior work, our analysis significantly tightens the upper bound on the minimal embedding dimension, and the resulting distortion grows only quadratically in $n$, entirely decoupled from the original feature dimension $d$. We further extend the framework to a dimensionality-reduction variant while preserving the same theoretical guarantees. These results provide essential theoretical support for the invertibility and stability of permutation-invariant representations in graph neural networks.
📝 Abstract
We study the sorting-based embedding $\beta_{\mathbf A} : \mathbb R^{n \times d} \to \mathbb R^{n \times D}$, $\mathbf X \mapsto {\downarrow}(\mathbf X \mathbf A)$, where $\downarrow$ denotes column-wise sorting of matrices. Such embeddings arise in graph deep learning, where outputs should be invariant to permutations of graph nodes. Previous work showed that for large enough $D$ and appropriate $\mathbf A$, the mapping $\beta_{\mathbf A}$ is injective, and moreover satisfies a bi-Lipschitz condition. However, two gaps remain: firstly, the optimal size $D$ required for injectivity is not yet known, and secondly, no estimates of the bi-Lipschitz constants of the mapping are known.
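As a concrete illustration of the construction just defined, the following sketch implements $\beta_{\mathbf A}$ with NumPy and checks its permutation invariance numerically. The sizes $n$, $d$, $D$ and the random choice of $\mathbf A$ are illustrative assumptions, not the specific matrices constructed in the paper.

```python
import numpy as np

def beta(X, A):
    """Sorting-based embedding beta_A(X) = sort_columns(X @ A).

    X: (n, d) input matrix (e.g. node features); A: (d, D) weight matrix.
    Sorting each column of X @ A (here in descending order; the convention
    is immaterial) discards the row ordering, so the result is invariant
    to permutations of the rows of X.
    """
    return -np.sort(-(X @ A), axis=0)

rng = np.random.default_rng(0)
n, d, D = 5, 3, 8                    # illustrative sizes only
X = rng.normal(size=(n, d))
A = rng.normal(size=(d, D))          # a generic random A, for illustration

P = np.eye(n)[rng.permutation(n)]    # random permutation matrix
assert np.allclose(beta(P @ X, A), beta(X, A))  # invariant under row permutations
```

Since $\mathbf P \mathbf X \mathbf A$ only permutes the rows of $\mathbf X \mathbf A$, the column-wise sort removes exactly that ambiguity, which is why $\beta_{\mathbf A}(\mathbf P \mathbf X) = \beta_{\mathbf A}(\mathbf X)$ for every permutation matrix $\mathbf P$.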
In this paper, we make substantial progress in addressing both of these gaps. Regarding the first gap, we improve upon the best known upper bounds for the embedding dimension $D$ necessary for injectivity, and also provide a lower bound on the minimal injectivity dimension. Regarding the second gap, we construct matrices $\mathbf A$ so that the bi-Lipschitz distortion of $\beta_{\mathbf A}$ depends quadratically on $n$, and is completely independent of $d$. We also show that the distortion of $\beta_{\mathbf A}$ is necessarily in $\Omega(\sqrt{n})$. Finally, we provide similar results for variants of $\beta_{\mathbf A}$ obtained by applying linear projections to reduce the output dimension of $\beta_{\mathbf A}$.