BOLT: Block-Orthonormal Lanczos for Trace estimation of matrix functions

📅 2025-05-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses trace estimation for extremely large matrices that cannot be stored or accessed in full. The authors propose the Block-Orthonormal Lanczos (BOLT) framework—the first to integrate orthonormal block probing with Lanczos iteration—and introduce a Subblock Stochastic Lanczos Quadrature (Subblock SLQ) variant, enabling trace estimation under stringent memory constraints where only matrix subblocks or localized matrix-vector products are accessible. BOLT combines random probing, Krylov subspace projection, block orthogonalization, and principal-submatrix approximation, with theoretically guaranteed error bounds. Experiments demonstrate that BOLT achieves higher accuracy than Hutch++ in flat-spectrum regimes and significantly improves both accuracy and efficiency in estimating KL divergence and Wasserstein-2 distance under low-memory conditions.
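The SLQ machinery the summary builds on can be illustrated with a minimal sketch: plain single-vector stochastic Lanczos quadrature with Rademacher probes. This is the textbook SLQ baseline, not the paper's block-orthonormal BOLT variant; function names and parameter defaults here are illustrative choices.

```python
import numpy as np

def slq_trace(A, f, num_probes=30, lanczos_steps=20, rng=None):
    """Estimate tr(f(A)) for a symmetric matrix A via stochastic Lanczos
    quadrature (plain single-vector SLQ, for illustration).

    Each Rademacher probe z drives a Lanczos recurrence; the eigenpairs of
    the resulting tridiagonal matrix T define a Gauss quadrature rule for
    z^T f(A) z, and averaging over probes estimates the trace.
    """
    rng = np.random.default_rng(rng)
    n = A.shape[0]
    estimates = []
    for _ in range(num_probes):
        z = rng.choice([-1.0, 1.0], size=n)
        q = z / np.linalg.norm(z)
        q_prev = np.zeros(n)
        beta = 0.0
        alphas, betas, Q = [], [], []
        for _ in range(lanczos_steps):
            Q.append(q)
            w = A @ q - beta * q_prev
            alpha = q @ w
            w = w - alpha * q
            # full reorthogonalization against earlier Lanczos vectors,
            # for numerical stability
            for qi in Q:
                w = w - (qi @ w) * qi
            alphas.append(alpha)
            beta = np.linalg.norm(w)
            if beta < 1e-12:  # Krylov space exhausted
                break
            betas.append(beta)
            q_prev, q = q, w / beta
        k = len(alphas)
        T = (np.diag(alphas)
             + np.diag(betas[:k - 1], 1)
             + np.diag(betas[:k - 1], -1))
        theta, S = np.linalg.eigh(T)
        # Gauss quadrature: nodes are eigenvalues of T, weights are the
        # squared first components of its eigenvectors
        quad = np.sum(S[0, :] ** 2 * f(theta))
        estimates.append(n * quad)  # ||z||^2 = n for Rademacher probes
    return float(np.mean(estimates))
```

For a diagonal matrix the Rademacher quadratic form is exact per probe, so `slq_trace(np.diag([1.0, 2.0, 3.0]), np.exp)` recovers tr(exp(A)) to machine precision; for general matrices the probe average converges at the usual Monte Carlo rate, which is exactly the rate Hutch++ and BOLT improve on.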

📝 Abstract
Efficient matrix trace estimation is essential for scalable computation of log-determinants, matrix norms, and distributional divergences. In many large-scale applications, the matrices involved are too large to store or access in full, making even a single matrix-vector (mat-vec) product infeasible. Instead, one often has access only to small subblocks of the matrix or localized matrix-vector products on restricted index sets. Hutch++ achieves optimal convergence rate but relies on randomized SVD and assumes full mat-vec access, making it difficult to apply in these constrained settings. We propose the Block-Orthonormal Stochastic Lanczos Quadrature (BOLT), which matches Hutch++ accuracy with a simpler implementation based on orthonormal block probes and Lanczos iterations. BOLT builds on the Stochastic Lanczos Quadrature (SLQ) framework, which combines random probing with Krylov subspace methods to efficiently approximate traces of matrix functions, and performs better than Hutch++ in near flat-spectrum regimes. To address memory limitations and partial access constraints, we introduce Subblock SLQ, a variant of BOLT that operates only on small principal submatrices. As a result, this framework yields a proxy KL divergence estimator and an efficient method for computing the Wasserstein-2 distance between Gaussians - both compatible with low-memory and partial-access regimes. We provide theoretical guarantees and demonstrate strong empirical performance across a range of high-dimensional settings.
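To make the abstract's partial-access setting concrete, the sketch below estimates a trace proxy from randomly sampled principal submatrices alone, never forming the full matrix. The function name, the uniform index sampling, and the n/k rescaling are illustrative assumptions, not the paper's exact Subblock SLQ algorithm; the scaled estimate is unbiased only for f(x) = x and is a heuristic proxy for general f.

```python
import numpy as np

def subblock_trace(A_lookup, n, k=64, num_blocks=10, f=None, rng=None):
    """Heuristic trace proxy for tr(f(A)) using only k-by-k principal
    submatrices of a symmetric n-by-n matrix (illustrative sketch).

    A_lookup(idx) must return the principal submatrix A[np.ix_(idx, idx)];
    the full matrix is never materialized.
    """
    rng = np.random.default_rng(rng)
    f = f if f is not None else (lambda x: x)
    total = 0.0
    for _ in range(num_blocks):
        idx = rng.choice(n, size=k, replace=False)
        block = A_lookup(idx)
        # apply f spectrally on the small block and rescale by n/k
        eigs = np.linalg.eigvalsh(block)
        total += (n / k) * np.sum(f(eigs))
    return total / num_blocks
```

For f(x) = x this is unbiased because E[(n/k) tr A_S] = tr A under uniform sampling; the interlacing-based error control for nonlinear f is where the paper's theoretical guarantees come in.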
Problem

Research questions and friction points this paper is trying to address.

Estimating matrix traces efficiently for large-scale applications
Handling matrices too large for full storage or access
Improving accuracy and simplicity in trace estimation methods
Innovation

Methods, ideas, or system contributions that make the work stand out.

Block-Orthonormal Lanczos for trace estimation
Subblock SLQ for low-memory matrix access
Proxy KL divergence and Wasserstein-2 estimator
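The Wasserstein-2 distance between Gaussians mentioned above has a closed form, W2² = ||μ1 − μ2||² + tr(Σ1 + Σ2 − 2(Σ1^{1/2} Σ2 Σ1^{1/2})^{1/2}), whose trace term is what the paper's estimators target without full matrix access. A direct full-access reference implementation, assuming SciPy is available:

```python
import numpy as np
from scipy.linalg import sqrtm

def gaussian_w2_squared(mu1, S1, mu2, S2):
    """Squared Wasserstein-2 distance between N(mu1, S1) and N(mu2, S2)
    via the closed form for Gaussians (direct full-access version).
    """
    r1 = sqrtm(S1)
    # sqrtm may return a complex array with tiny imaginary parts for
    # symmetric PSD inputs; keep the real part
    cross = np.real(sqrtm(r1 @ S2 @ r1))
    mean_term = np.sum((np.asarray(mu1) - np.asarray(mu2)) ** 2)
    trace_term = np.trace(S1 + S2 - 2.0 * cross)
    return float(mean_term + trace_term)
```

This version forms and factors full covariance matrices, which is exactly what becomes infeasible at scale; the paper's contribution is estimating the trace term from subblocks or local mat-vecs instead.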