Determinant Estimation under Memory Constraints and Neural Scaling Laws

📅 2025-03-06
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the challenge of accurately estimating the log-determinant of large-scale, ill-conditioned positive semidefinite matrices—such as neural tangent kernels (NTKs) with tens of millions of samples—under severe memory constraints. We propose a hierarchical block LDL decomposition framework that circumvents the prohibitive O(n³) computational cost and O(n²) storage requirement of standard dense methods. Crucially, we introduce neural scaling laws into determinant estimation for the first time, establishing a power-law extrapolation model for pseudo-determinant ratios; this enables high-accuracy full-matrix log-det prediction from only a tiny subset of entries. Our method achieves approximately 10⁵× speedup on massive dense matrices, significantly outperforming existing approximation techniques in accuracy. Notably, it is the first to successfully compute the log-determinant of million-scale NTKs—previously intractable due to memory and computational limitations.

Technology Category

Application Category

📝 Abstract
Calculating or accurately estimating log-determinants of large positive semi-definite matrices is of fundamental importance in many machine learning tasks. While its cubic computational complexity can already be prohibitive, in modern applications, even storing the matrices themselves can pose a memory bottleneck. To address this, we derive a novel hierarchical algorithm based on block-wise computation of the LDL decomposition for large-scale log-determinant calculation in memory-constrained settings. In extreme cases where matrices are highly ill-conditioned, accurately computing the full matrix itself may be infeasible. This is particularly relevant when considering kernel matrices at scale, including the empirical Neural Tangent Kernel (NTK) of neural networks trained on large datasets. Under the assumption of neural scaling laws in the test error, we show that the ratio of pseudo-determinants satisfies a power-law relationship, allowing us to derive corresponding scaling laws. This enables accurate estimation of NTK log-determinants from a tiny fraction of the full dataset; in our experiments, this results in a $sim$100,000$ imes$ speedup with improved accuracy over competing approximations. Using these techniques, we successfully estimate log-determinants for dense matrices of extreme sizes, which were previously deemed intractable and inaccessible due to their enormous scale and computational demands.
Problem

Research questions and friction points this paper is trying to address.

Estimate log-determinants of large matrices under memory constraints.
Address computational complexity and storage bottlenecks in machine learning.
Enable accurate NTK log-determinant estimation using neural scaling laws.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical algorithm for log-determinant calculation
Block-wise LDL decomposition for memory efficiency
Power-law scaling for pseudo-determinant estimation
🔎 Similar Papers
No similar papers found.
S
S. Ameli
ICSI and Department of Statistics, University of California, Berkeley
C
Christopher van der Heide
Dept. of Electrical and Electronic Engineering, University of Melbourne
Liam Hodgkinson
Liam Hodgkinson
University of Melbourne
probabilistic machine learningdeep learning theory
Fred Roosta
Fred Roosta
University of Queensland
Machine LearningNumerical OptimizationComputational StatisticsScientific Computing
M
Michael W. Mahoney
ICSI, LBNL, and Department of Statistics, University of California, Berkeley