🤖 AI Summary
This work addresses the challenge of accurately estimating the log-determinant of large-scale, ill-conditioned positive semidefinite matrices—such as neural tangent kernels (NTKs) with tens of millions of samples—under severe memory constraints. We propose a hierarchical block LDL decomposition framework that circumvents the prohibitive O(n³) computational cost and O(n²) storage requirement of standard dense methods. Crucially, we introduce neural scaling laws into determinant estimation for the first time, establishing a power-law extrapolation model for pseudo-determinant ratios; this enables high-accuracy full-matrix log-det prediction from only a tiny subset of entries. Our method achieves approximately 10⁵× speedup on massive dense matrices, significantly outperforming existing approximation techniques in accuracy. Notably, it is the first to successfully compute the log-determinant of million-scale NTKs—previously intractable due to memory and computational limitations.
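The core identity behind the block approach is that, for a positive definite matrix partitioned into blocks, the log-determinant splits into the log-determinant of the pivot block plus that of its Schur complement. Below is a minimal in-memory sketch of that recursion, assuming NumPy; the paper's actual algorithm is hierarchical and memory-constrained (streaming blocks rather than holding the full matrix), so `blockwise_logdet` is illustrative, not the authors' implementation:

```python
import numpy as np

def blockwise_logdet(A, b):
    """Log-determinant of a symmetric positive definite matrix A,
    computed b x b pivot block at a time via Schur complements:
    log det(A) = log det(A11) + log det(A22 - A21 A11^{-1} A12)."""
    A = A.astype(float).copy()  # the loop overwrites trailing blocks
    n = A.shape[0]
    logdet = 0.0
    for start in range(0, n, b):
        end = min(start + b, n)
        D = A[start:end, start:end]          # current pivot block
        logdet += np.linalg.slogdet(D)[1]    # small dense log-det
        if end < n:
            B = A[end:, start:end]           # off-diagonal panel
            # Schur complement update: A22 <- A22 - B D^{-1} B^T
            A[end:, end:] -= B @ np.linalg.solve(D, B.T)
    return logdet

# Sanity check against a direct dense computation.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 12))
K = X @ X.T + np.eye(50)                     # well-conditioned PSD matrix
ld_block = blockwise_logdet(K, b=16)
ld_ref = np.linalg.slogdet(K)[1]
```

The point of the block formulation is that only one `b x b` pivot block and one panel need to be resident at a time, which is what makes the O(n²) storage of the full matrix avoidable in an out-of-core variant.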
📝 Abstract
Calculating or accurately estimating log-determinants of large positive semi-definite matrices is of fundamental importance in many machine learning tasks. While its cubic computational complexity can already be prohibitive, in modern applications, even storing the matrices themselves can pose a memory bottleneck. To address this, we derive a novel hierarchical algorithm based on block-wise computation of the LDL decomposition for large-scale log-determinant calculation in memory-constrained settings. In extreme cases where matrices are highly ill-conditioned, accurately computing the full matrix itself may be infeasible. This is particularly relevant when considering kernel matrices at scale, including the empirical Neural Tangent Kernel (NTK) of neural networks trained on large datasets. Under the assumption of neural scaling laws in the test error, we show that the ratio of pseudo-determinants satisfies a power-law relationship, allowing us to derive corresponding scaling laws. This enables accurate estimation of NTK log-determinants from a tiny fraction of the full dataset; in our experiments, this results in a $\sim$100,000$\times$ speedup with improved accuracy over competing approximations. Using these techniques, we successfully estimate log-determinants for dense matrices of extreme sizes, which were previously deemed intractable and inaccessible due to their enormous scale and computational demands.
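The extrapolation step can be sketched as a standard power-law fit: measure the quantity of interest on a few small subset sizes, fit $r(n) \approx a\,n^{b}$ by least squares in log-log space, and evaluate the fit at the full dataset size. The snippet below uses synthetic values standing in for measured pseudo-determinant ratios; the exact functional form and fitting procedure in the paper may differ:

```python
import numpy as np

# Subset sizes and synthetic "measured" ratios r(n); in practice these
# would come from pseudo-determinants of sub-kernels at each size.
ns = np.array([1_000, 2_000, 4_000, 8_000, 16_000], dtype=float)
a_true, b_true = 50.0, -0.3          # assumed power law r(n) = a * n**b
rs = a_true * ns**b_true

# Fit the power law as a line in log-log space: log r = log a + b log n.
b_fit, log_a_fit = np.polyfit(np.log(ns), np.log(rs), 1)
a_fit = np.exp(log_a_fit)

# Extrapolate to the full (otherwise intractable) dataset size.
N_full = 1_000_000
r_pred = a_fit * N_full**b_fit
```

Because only the small-subset kernels ever need to be formed and factorized, the cost of the prediction is governed by the largest subset size rather than the full dataset, which is the source of the reported $\sim$100,000$\times$ speedup.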