🤖 AI Summary
Existing quantum error correction decoders struggle to achieve high accuracy, speed, generality, and scalability at the same time, which makes them inadequate for real-time fault-tolerant computation on millions of qubits. This work proposes Lottery BP, a decoder that introduces randomness into belief propagation (BP) and improves decoding accuracy over conventional BP by 2 to 8 orders of magnitude on topological codes. A syndrome vote pre-processing step compresses multiple rounds of syndromes into one, which increases the latency margin of decoding and mitigates the backlog problem. The PolyQec architecture pairs Lottery BP as a local decoder with ordered statistics decoding (OSD) as a global decoder, and invokes OSD 3 to 5 orders of magnitude less often than BP+OSD on topological codes. Finally, Syndrilla, a modular PyTorch-based decoding simulator, runs 1 to 2 orders of magnitude faster on GPUs than on CPUs.
📝 Abstract
Real-time fault tolerance on millions of qubits requires scalable decoding, which motivates this paper. Existing decoding algorithms (decoders), such as clustering, matching, belief propagation (BP), and neural networks, suffer from one or more of inaccuracy, high cost, and incompatibility across a broad set of quantum error correction codes, such as the surface code, toric code, and bivariate bicycle code. A gap therefore remains between existing decoders and an ideal decoder that is simultaneously accurate, fast, general, and scalable. This paper makes three contributions: a decoder, a decoder architecture, and a decoding simulator. First, we propose Lottery BP, a decoder that introduces randomness during decoding. Lottery BP improves decoding accuracy over BP by 2 to 8 orders of magnitude for topological codes. To decode multi-round measurement errors efficiently, we propose syndrome vote as a pre-processing step before Lottery BP, which compresses multiple rounds of syndromes into one. Syndrome voting increases the latency margin of decoding and mitigates the backlog problem. Second, we design the PolyQec architecture, which implements Lottery BP as a local decoder and ordered statistics decoding (OSD) as a global decoder, and which is configurable for surface and toric codes and for X and Z checks. Because Lottery BP boosts local decoding accuracy, PolyQec invokes the costly global OSD decoder less frequently than BP+OSD does, which improves scalability: for topological codes, OSD is invoked 3 to 5 orders of magnitude less often. Third, to evaluate decoders fairly, we develop Syndrilla, a PyTorch-based decoding simulator that modularizes the simulation pipeline and allows new decoders to be added flexibly. We formulate multiple metrics to quantify decoder performance and integrate them into Syndrilla. Running on GPUs, Syndrilla is 1 to 2 orders of magnitude faster than on CPUs.
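The idea of compressing several syndrome rounds into one can be illustrated with a small sketch. The abstract does not specify the voting rule, so the code below assumes a per-check strict majority vote with ties resolved to 0; the function name `syndrome_vote` and the list-of-lists input layout are hypothetical and are not taken from the paper or from Syndrilla.

```python
from typing import List, Sequence


def syndrome_vote(rounds: Sequence[Sequence[int]]) -> List[int]:
    """Compress several rounds of syndrome bits into a single round.

    Assumption (not from the paper): each check bit is decided by a
    strict majority over the rounds, and a tie counts as 0.
    """
    if not rounds:
        raise ValueError("need at least one syndrome round")
    num_checks = len(rounds[0])
    if any(len(r) != num_checks for r in rounds):
        raise ValueError("all rounds must have the same number of checks")

    num_rounds = len(rounds)
    voted = []
    for check in range(num_checks):
        # Count how many rounds reported this check as violated.
        ones = sum(r[check] for r in rounds)
        # Keep the bit only if it fired in more than half of the rounds.
        voted.append(1 if 2 * ones > num_rounds else 0)
    return voted


# Three measurement rounds over four checks; check 2 fires only once,
# which this rule treats as a likely measurement error.
rounds = [
    [0, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
]
compressed = syndrome_vote(rounds)
print(compressed)  # [0, 1, 0, 0]
```

Under this assumed rule, the decoder downstream sees one syndrome vector instead of three, which is the sense in which voting reduces the amount of data the decoder must process per logical cycle.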