AI Summary
This work addresses the fundamental trade-off between microsecond-scale latency and high decoding accuracy that hinders the practical deployment of neural decoders in quantum error correction. Under explicit accuracy-latency constraints, the authors unify and redesign representative surface-code neural decoders into five architectural paradigms, develop an end-to-end compression pipeline, and evaluate deployability and performance on FPGA hardware for code distances up to $d=9$. Their findings indicate that near-term gains in decoding performance are driven more by training data scale than by architectural complexity, that appropriate inductive bias is essential for achieving high accuracy, and that INT4 quantization is a prerequisite for meeting microsecond-scale latency requirements on FPGAs. The study offers concrete guidance and an empirical foundation for scalable, real-time neural quantum error correction.
Abstract
Quantum error correction (QEC) is essential for enabling quantum advantage, with decoding as a central algorithmic primitive. Owing to its importance and intrinsic difficulty, substantial effort has been devoted to QEC decoder design, among which neural decoders have recently emerged as a promising data-driven paradigm. Despite this progress, practical deployment remains hindered by a fundamental accuracy-latency tradeoff, with latency budgets often on the microsecond timescale. To address this challenge, here we revisit neural decoders for surface-code decoding under explicit accuracy-latency constraints, considering code distances up to $d=9$ (161 physical qubits). We unify and redesign representative neural decoders into five architectural paradigms and develop an end-to-end compression pipeline to evaluate their deployability and performance on FPGA hardware. Through systematic experiments, we reveal several previously underexplored insights: (i) near-term decoding performance is driven more by data scale than architectural complexity; (ii) appropriate inductive bias is essential for achieving high decoding accuracy; and (iii) INT4 quantization is a prerequisite for meeting microsecond-scale latency requirements on FPGAs. Together, these findings provide concrete guidance toward scalable and real-time neural QEC decoding.
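To make finding (iii) more concrete, the following is a minimal sketch of what INT4 quantization of a weight tensor can look like. The abstract does not specify the quantization scheme used in the compression pipeline, so this assumes a generic symmetric per-tensor scheme. The function names `quantize_int4` and `dequantize` and the example weights are hypothetical and chosen for illustration only.

```python
def quantize_int4(weights):
    """Symmetric per-tensor quantization to signed 4-bit integers in [-8, 7].

    Assumption: a generic scheme for illustration, not the paper's pipeline.
    """
    max_abs = max(abs(w) for w in weights)
    # Map the largest magnitude to 7; guard against an all-zero tensor.
    scale = max_abs / 7 if max_abs > 0 else 1.0
    # Round to the nearest integer level and clamp to the INT4 range.
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate floating-point values from INT4 levels."""
    return [v * scale for v in q]


# Hypothetical weights, used only to exercise the functions.
weights = [0.70, -0.30, 0.12, 0.0, -0.70]
q, scale = quantize_int4(weights)
recovered = dequantize(q, scale)
max_error = max(abs(w - r) for w, r in zip(weights, recovered))
print(q, scale, max_error)
```

Each weight is stored as one of 16 integer levels plus a shared scale, so the rounding error per weight is bounded by half the scale. The narrower integer arithmetic is what makes such a representation attractive on resource-constrained hardware such as FPGAs.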