🤖 AI Summary
This work addresses the longstanding trade-off in deep learning between the efficiency of fixed-structure linear operators (e.g., FFT) and the expressive power of learnable dense matrices, which are typically memory-prohibitive at scale. The authors propose LAPLEX, a trainable phase Laplacian operator that implicitly defines dense matrices via learnable coordinate anchors, enabling matrix–vector multiplications up to 10⁹ dimensions without explicitly storing the full matrix. LAPLEX achieves FFT-like computational scaling while decoupling model expressivity from storage cost for the first time, facilitating data-adaptive global interactions. The method is successfully applied to compact interpretable classification heads and high-dimensional covariance modeling, preserving spatial structure on images with up to 3×10⁶ dimensions—without relying on convolutional inductive biases—thereby substantially expanding the scale and applicability of trainable dense layers.
📝 Abstract
Fast linear algebra in deep learning usually comes with a choice: fixed geometry and exact computation, as in the Fourier transform, or adaptive geometry paid for by dense parameters, random features, or low-rank surrogates. To move beyond this trade-off, we introduce LAPLEX, a class of exact, trainable (phased) Laplace-kernel operators.
A LAPLEX layer is a typically full-rank dense matrix, implicitly defined by learnable coordinate anchors, with FFT-like scaling. Consequently, it supports trainable matrix--vector operations at vector dimensions up to $10^9$ on modern GPUs. As a neural layer, it yields compact projections and classification heads interpretable as soft, trainable routing models. The same primitive also serves as an efficient Gram operator, enabling high-dimensional covariance models on flattened images of dimension $3 \cdot 10^6$ that preserve visible spatial structure without imposing convolutional bias. These applications reflect a single principle: dense geometry can be learned without storing a dense matrix, which enables data-adaptive global interactions in regimes where ordinary dense layers are out of reach. In this sense, LAPLEX separates expressivity from storage cost: it behaves like a dense trainable matrix, but is represented and applied through a small structured set of parameters.