🤖 AI Summary
This work addresses a bottleneck in long-term beamforming for large-scale MU-MIMO: the condition number of the channel covariance matrix deteriorates as the dynamic range of per-user signal-to-noise ratios grows, which sharply increases the number of conjugate gradient (CG) iterations needed for the inversion and, in turn, latency and energy consumption. To mitigate this, the paper proposes a hardware-oriented low-rank preconditioning framework that builds the preconditioner from the dominant eigenpairs of the covariance matrix. By combining randomized complex eigenvalue decomposition (RC-EVD) with Cholesky-based QR factorization (QRC), the approach reduces the core computations to GEMM operations and small triangular solves that suit systolic-array implementation. Running the preconditioned CG inversion in the beamspace domain additionally sparsifies the system matrix and further accelerates convergence. Ray-tracing simulations show a 2–3× reduction in iteration count while maintaining post-equalization SINR comparable to exact matrix inversion.
📝 Abstract
Long-term beamforming substantially reduces the channel estimation and inversion overhead of conventional massive MU-MIMO receivers, yet its construction still hinges on the inversion of a large Hermitian matrix whose condition number deteriorates as the per-user SNR dynamic range grows. When this inversion is approximated in hardware via the conjugate gradient (CG) algorithm, the deterioration directly inflates the iteration count and, consequently, the energy and latency budget. We propose a hardware-friendly low-rank preconditioning framework that targets exactly this bottleneck. The preconditioner is constructed from the top eigenpairs of the long-term covariance matrix through a randomized complex eigenvalue decomposition (RC-EVD), whose inner QR factorizations are realized via a Cholesky-based scheme (QRC), confining the dominant cost to general matrix multiplication (GEMM) and small triangular solves that map naturally onto systolic arrays. We further show that performing the preconditioned CG inversion in the beamspace domain induces sparsification of the system matrix and provides additional convergence acceleration at negligible transformation cost. Ray-tracing simulations confirm that the joint scheme reduces the required CG iteration count by a factor of two to three while matching the post-equalization SINR of the exact inversion.
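The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the paper's implementation. The function names (chol_qr, randomized_evd, make_preconditioner, pcg), the synthetic Rayleigh channel, the problem sizes, and the specific preconditioner form (the scaled inverse of a rank-k approximation of R plus noise) are all assumptions made for illustration. The beamspace transform is omitted, and np.linalg.solve stands in for the dedicated small triangular solve a hardware design would use.

```python
import numpy as np

def chol_qr(Y):
    """Cholesky-based QR: orthonormalize the columns of Y via its Gram matrix."""
    G = Y.conj().T @ Y                      # small Gram matrix (one GEMM)
    L = np.linalg.cholesky(G)               # G = L L^H with L lower triangular
    # Q = Y L^{-H}; a general solve stands in for a triangular solve here.
    return np.linalg.solve(L, Y.conj().T).conj().T

def randomized_evd(R, k, oversample=2, power_iters=1, seed=0):
    """Approximate the top-k eigenpairs of a Hermitian PSD matrix R."""
    rng = np.random.default_rng(seed)
    n, width = R.shape[0], k + oversample
    omega = rng.standard_normal((n, width)) + 1j * rng.standard_normal((n, width))
    Q = chol_qr(R @ omega)                  # sketch the range of R
    for _ in range(power_iters):            # power iterations sharpen the basis
        Q = chol_qr(R @ Q)
    B = Q.conj().T @ R @ Q                  # small projected matrix
    lam, V = np.linalg.eigh((B + B.conj().T) / 2)   # ascending eigenvalues
    return lam[::-1][:k], Q @ V[:, ::-1][:, :k]

def make_preconditioner(lam, U, sigma2):
    """Return v -> M^{-1} v, M^{-1} = I + U diag(sigma2/(lam+sigma2) - 1) U^H."""
    w = sigma2 / (lam + sigma2) - 1.0       # illustrative choice, not from the paper
    return lambda v: v + U @ (w * (U.conj().T @ v))

def pcg(A, b, apply_minv=None, tol=1e-8, max_iter=200):
    """Preconditioned CG for Hermitian positive definite A; returns (x, iterations)."""
    if apply_minv is None:
        apply_minv = lambda v: v            # plain CG
    x = np.zeros_like(b)
    r = b.copy()
    z = apply_minv(r)
    p = z.copy()
    rz = np.vdot(r, z)
    b_norm = np.linalg.norm(b)
    for it in range(1, max_iter + 1):
        Ap = A @ p
        alpha = rz / np.vdot(p, Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        if np.linalg.norm(r) <= tol * b_norm:
            return x, it
        z = apply_minv(r)
        rz_new = np.vdot(r, z)
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, max_iter

# Synthetic example (assumed setup): 64 antennas, 8 users, 30 dB SNR spread.
rng = np.random.default_rng(1)
n_ant, n_users, sigma2 = 64, 8, 1.0
H = (rng.standard_normal((n_ant, n_users))
     + 1j * rng.standard_normal((n_ant, n_users))) / np.sqrt(2)
gains = 10.0 ** np.linspace(0, 3, n_users)          # per-user power, 0 to 30 dB
R = (H * gains) @ H.conj().T / n_ant                # long-term covariance
A = R + sigma2 * np.eye(n_ant)                      # matrix to be inverted
b = rng.standard_normal(n_ant) + 1j * rng.standard_normal(n_ant)

lam, U = randomized_evd(R, k=4)
x_cg, it_cg = pcg(A, b)
x_pcg, it_pcg = pcg(A, b, make_preconditioner(lam, U, sigma2))
x_ref = np.linalg.solve(A, b)
print("CG iterations:", it_cg, "| preconditioned CG iterations:", it_pcg)
```

In this sketch every step other than the small eigendecomposition of B is a matrix product or a solve against a small triangular factor, which mirrors the GEMM-dominated cost structure the abstract describes. Cholesky-based QR squares the condition number of the matrix being orthonormalized, so a practical design would likely need to bound the dynamic range or repeat the factorization. How the paper handles this is not stated in the abstract.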