🤖 AI Summary
This paper addresses the excessive arithmetic and communication overhead of the Jacobi method for computing eigenvalues of symmetric matrices and singular value decompositions (SVD). To mitigate this, we propose two optimized variants: block Jacobi and recursive block Jacobi. We theoretically establish that the classical block Jacobi algorithm achieves the asymptotic communication lower bound while maintaining $O(n^3)$ arithmetic complexity. Furthermore, the recursive block variant, when combined with fast matrix multiplication (exponent $\omega_0 < 3$), simultaneously attains near-optimal arithmetic and communication complexity. Our approach integrates communication-avoiding techniques, parallel Jacobi iterations, and recursive matrix decomposition, significantly reducing memory access volume and inter-processor data movement. This work provides the first rigorous proof of communication optimality for Jacobi-type algorithms, establishing a scalable, high-efficiency computational paradigm for large-scale eigenanalysis and SVD.
📝 Abstract
In this paper, we analyze several versions of Jacobi's method for the symmetric eigenvalue problem. Our goal throughout is to reduce the asymptotic cost of the algorithm as much as possible, as measured by the number of arithmetic operations performed and associated (sequential or parallel) communication, i.e., the amount of data moved between slow and fast memory or between processors in a network. In producing rigorous complexity bounds, we allow our algorithms to be built on both classic $O(n^3)$ matrix multiplication and fast, Strassen-like $O(n^{\omega_0})$ alternatives. In the classical setting, we show that a blocked implementation of Jacobi's method attains the communication lower bound for $O(n^3)$ matrix multiplication (and is therefore expected to be communication optimal among $O(n^3)$ methods). In the fast setting, we demonstrate that a recursive version of blocked Jacobi can go even further, reaching essentially optimal complexity in both measures. We also discuss Jacobi-based SVD algorithms and a parallel version of block Jacobi, showing that analogous complexity bounds apply.
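For orientation, the scalar (unblocked) cyclic Jacobi iteration that the blocked and recursive variants generalize can be sketched as below. This is a minimal NumPy illustration of the textbook method, not the blocked or recursive algorithm analyzed in the paper. The function name `jacobi_eigh`, the tolerance, and the sweep limit are assumptions chosen for the example.

```python
import numpy as np

def jacobi_eigh(A, tol=1e-12, max_sweeps=50):
    """Textbook cyclic Jacobi for a real symmetric matrix (illustrative sketch).

    Returns (w, V) with V.T @ A @ V approximately diag(w).
    """
    A = np.array(A, dtype=float)   # work on a copy
    n = A.shape[0]
    V = np.eye(n)                  # accumulates the rotations
    for _ in range(max_sweeps):
        # Frobenius norm of the off-diagonal part measures progress.
        off = np.linalg.norm(A - np.diag(np.diag(A)))
        if off <= tol * np.linalg.norm(A):
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if A[p, q] == 0.0:
                    continue
                # Choose the smaller rotation angle that zeroes A[p, q].
                tau = (A[q, q] - A[p, p]) / (2.0 * A[p, q])
                sign = 1.0 if tau >= 0.0 else -1.0
                t = sign / (abs(tau) + np.hypot(1.0, tau))
                c = 1.0 / np.hypot(1.0, t)
                s = t * c
                R = np.array([[c, s], [-s, c]])
                # Apply the similarity transform J.T @ A @ J, touching
                # only rows and columns p and q.
                A[:, [p, q]] = A[:, [p, q]] @ R
                A[[p, q], :] = R.T @ A[[p, q], :]
                V[:, [p, q]] = V[:, [p, q]] @ R
    return np.diag(A).copy(), V

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    B = rng.standard_normal((6, 6))
    S = (B + B.T) / 2
    w, V = jacobi_eigh(S)
    # Compare against NumPy's symmetric eigenvalue solver.
    print(np.allclose(np.sort(w), np.linalg.eigvalsh(S)))
```

Each rotation here touches only two rows and two columns, so the data reuse per word moved is low. A blocked variant would instead operate on pairs of block rows and columns, which is the setting in which the abstract's communication bounds are stated.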