Parallel Quadratic Selected Inversion in Quantum Transport Simulation

2026-01-08
arXiv.org
Citations: 0
Influential: 0
AI Summary
This study addresses the limited scalability of quantum transport simulations for nanoscale multi-terminal transistors. Existing algorithms are typically sequential, assume block-tridiagonal matrices, and have been implemented only with shared-memory parallelism. To overcome these limitations, the work proposes distributed, GPU-accelerated methods that build on the recursive Green's function (RGF) technique to parallelize both selected inversion and the selected solution of quadratic matrix equations. The methods also handle block-tridiagonal matrices with an arrowhead, which enables the simulation of multi-terminal devices. On a nano-ribbon transistor dataset from a non-equilibrium Green's function (NEGF) simulation, the fused solver running on 16 GPUs is 5.2× faster than the selected inversion module of PARDISO applied to a device 16 times shorter.

Abstract
Driven by Moore's Law, the dimensions of transistors have been pushed down to the nanometer scale. Advanced quantum transport (QT) solvers are required to accurately simulate such nano-devices. The non-equilibrium Green's function (NEGF) formalism lends itself optimally to these tasks, but it is computationally very intensive, involving the selected inversion (SI) of matrices and the selected solution of quadratic matrix (SQ) equations. Existing algorithms for these numerical problems, e.g., the so-called recursive Green's function (RGF) technique, are ideally suited to GPU acceleration, but they are typically sequential, require block-tridiagonal (BT) matrices as inputs, and their implementation has so far been restricted to shared-memory parallelism, which limits the achievable device sizes. To address these shortcomings, we introduce distributed methods that build on RGF and enable parallel selected inversion and selected solution of the quadratic matrix equation. We further extend them to handle BT matrices with an arrowhead, which allows for the investigation of multi-terminal transistor structures. We evaluate the performance of our approach on a real dataset from the QT simulation of a nano-ribbon transistor and compare it with the sparse direct package PARDISO. When scaling to 16 GPUs, our fused SI and SQ solver is 5.2x faster than the SI module of PARDISO applied to a device 16x shorter. These results highlight the potential of our method to accelerate NEGF-based nano-device simulations.
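
To make the term "selected inversion" concrete, the following is a minimal sketch of the textbook sequential RGF recursion for a block-tridiagonal matrix, written with NumPy on the CPU. It is not the distributed GPU method of the paper, it does not cover the arrowhead extension or the quadratic matrix equation, and the function name `rgf_selected_inverse` and its arguments `diag`, `upper`, `lower` are hypothetical names chosen for illustration. The sketch returns only the diagonal and first off-diagonal blocks of the inverse, which is what "selected" refers to.

```python
import numpy as np


def rgf_selected_inverse(diag, upper, lower):
    """Sequential RGF selected inversion of a block-tridiagonal matrix A.

    diag[i]  is the block A[i, i]      (n blocks)
    upper[i] is the block A[i, i + 1]  (n - 1 blocks)
    lower[i] is the block A[i + 1, i]  (n - 1 blocks)

    Returns the diagonal, upper and lower blocks of inv(A) without
    ever forming the full dense inverse.
    """
    n = len(diag)

    # Forward pass: g[i] is the inverse of the Schur complement of block i
    # with respect to all blocks before it.
    g = [None] * n
    g[0] = np.linalg.inv(diag[0])
    for i in range(1, n):
        schur = diag[i] - lower[i - 1] @ g[i - 1] @ upper[i - 1]
        g[i] = np.linalg.inv(schur)

    # Backward pass: the last g block is already a block of inv(A);
    # the remaining blocks follow from a bottom-up recursion.
    G_diag = [None] * n
    G_upper = [None] * (n - 1)
    G_lower = [None] * (n - 1)
    G_diag[n - 1] = g[n - 1]
    for i in range(n - 2, -1, -1):
        G_upper[i] = -g[i] @ upper[i] @ G_diag[i + 1]
        G_lower[i] = -G_diag[i + 1] @ lower[i] @ g[i]
        G_diag[i] = g[i] - g[i] @ upper[i] @ G_lower[i]

    return G_diag, G_upper, G_lower
```

Both passes walk the blocks strictly one after another, which illustrates why the abstract describes such algorithms as typically sequential. For small test matrices, the returned blocks can be compared against the corresponding blocks of `np.linalg.inv` applied to the assembled dense matrix.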

Problem

Research questions and friction points this paper is trying to address.

quantum transport
selected inversion
quadratic matrix equation
non-equilibrium Green's function
nanoscale transistor simulation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Parallel Selected Inversion
Quadratic Matrix Equation
Distributed GPU Computing
Block-Tridiagonal with Arrowhead
Non-equilibrium Green's Function
V. Maillou
D-ITET, ETH Zurich, Zurich, Switzerland
Matthias Bollhofer
Institute for Numerical Analysis, TUBS, Braunschweig, Germany
Olaf Schenk
Professor, Institute of Computing, Universita della Svizzera italiana, SIAM Fellow
High Performance Computing, Computational Engineering, Computational Science, Simulation
A. Ziogas
D-ITET, ETH Zurich, Zurich, Switzerland
Mathieu Luisier
ETH Zurich
Computational nanoelectronics, device modeling