When big data actually are low-rank, or entrywise approximation of certain function-generated matrices

📅 2024-07-03
🏛️ arXiv.org
📈 Citations: 2
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the common misconception that “large-scale data matrices are inherently low-rank.” It studies low-rank approximation of $n \times n$ matrices generated by sampling a smooth function $f(x, y)$ of two $m$-dimensional variables. Theoretically, it establishes that such matrices admit dimension-independent (i.e., independent of $m$) entrywise $\varepsilon$-approximation by matrices of rank $\mathcal{O}(\log(n)\,\varepsilon^{-2} \log(\varepsilon^{-1}))$ when $f$ belongs to one of three analytic classes: functions of the inner product, functions of the Euclidean distance, or shift-invariant positive-definite kernels. This gives a rigorous account, with explicit error bounds, of function families that enable dimension-independent low-rank approximation. The analysis extends to tensor-train (TT) decompositions of function-generated tensors. The results provide theoretical support for dimensionality reduction in large-scale data analytics and for approximating attention mechanisms in Transformers.

📝 Abstract
The article concerns low-rank approximation of matrices generated by sampling a smooth function of two $m$-dimensional variables. We identify several misconceptions surrounding a claim that, for a specific class of analytic functions, such $n \times n$ matrices admit accurate entrywise approximation of rank that is independent of $m$ and grows as $\log(n)$ -- colloquially known as "big-data matrices are approximately low-rank". We provide a theoretical explanation of the numerical results presented in support of this claim, describing three narrower classes of functions for which function-generated matrices can be approximated within an entrywise error of order $\varepsilon$ with rank $\mathcal{O}(\log(n) \varepsilon^{-2} \log(\varepsilon^{-1}))$ that is independent of the dimension $m$: (i) functions of the inner product of the two variables, (ii) functions of the Euclidean distance between the variables, and (iii) shift-invariant positive-definite kernels. We extend our argument to tensor-train approximation of tensors generated with functions of the "higher-order inner product" of their multiple variables. We discuss our results in the context of low-rank approximation of (a) growing datasets and (b) attention in transformer neural networks.
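A minimal numerical sketch of the flavor of result described above, for class (iii) only. This is not the paper's proof technique: it uses the classical random Fourier features of Rahimi and Recht, which give an explicit rank-$D$ entrywise approximation of a shift-invariant positive-definite kernel matrix with $D$ independent of the ambient dimension $m$. All sizes, the helper name `rff_error`, and the choice of Gaussian kernel are illustrative assumptions.

```python
# Sketch (assumed setup, not the paper's construction): random Fourier
# features yield a rank-D factorization whose entrywise error depends on D
# and n but not on the dimension m of the sampled points.
import numpy as np

def rff_error(m, n=300, D=2000, seed=0):
    """Max entrywise error of a rank-D random-feature approximation of the
    Gaussian-kernel matrix K_ij = exp(-||x_i - y_j||^2 / 2) in dimension m."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, m)) / np.sqrt(m)   # points with O(1) norms
    Y = rng.standard_normal((n, m)) / np.sqrt(m)
    # Exact kernel matrix via the expansion ||x-y||^2 = |x|^2 + |y|^2 - 2<x,y>.
    sq = (X**2).sum(1)[:, None] + (Y**2).sum(1)[None, :] - 2 * X @ Y.T
    K = np.exp(-sq / 2)
    # Random Fourier features: z(x) = sqrt(2/D) * cos(Wx + b), W ~ N(0, I),
    # so that E[z(x) . z(y)] = exp(-||x-y||^2 / 2).
    W = rng.standard_normal((D, m))
    b = rng.uniform(0, 2 * np.pi, D)
    ZX = np.sqrt(2 / D) * np.cos(X @ W.T + b)
    ZY = np.sqrt(2 / D) * np.cos(Y @ W.T + b)
    return np.abs(K - ZX @ ZY.T).max()             # rank(ZX @ ZY.T) <= D

# With D fixed, the entrywise error stays comparable as m grows:
for m in (5, 50, 500):
    print(m, rff_error(m))
```

The rank bound here is the number of features $D$, chosen up front; a Hoeffding-plus-union-bound argument gives uniform entrywise error $\mathcal{O}(\sqrt{\log(n)/D})$, matching the $\varepsilon^{-2}\log(n)$ dependence discussed in the abstract.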
Problem

Research questions and friction points this paper is trying to address.

Low-rank approximation of function-generated matrices
Entrywise error analysis for specific function classes
Rank independence from dimension in big-data matrices
Innovation

Methods, ideas, or system contributions that make the work stand out.

Low-rank approximation for functions of the inner product
Low-rank approximation for functions of the Euclidean distance
Shift-invariant positive-definite kernels enable dimension-independent rank
Stanislav Budzinskiy
Faculty of Mathematics, University of Vienna, Kolingasse 14-16, 1090 Vienna, Austria