🤖 AI Summary
This study investigates decoder dependence in surface code threshold estimation, focusing on performance disparities under Pauli noise versus native GKP Gaussian displacement digitization. Within a unified experimental framework, the authors systematically evaluate the thresholds and computational efficiency of multiple decoders—including Minimum-Weight Perfect Matching (MWPM), Union-Find, Belief Propagation, and neural-guided MWPM—across distinct noise models, while introducing parallel sampling to enhance throughput. A standardized threshold reporting protocol integrating runtime and logical fidelity is proposed, revealing significant sensitivity of threshold estimates to decoder choice. Experimental results demonstrate that MWPM exhibits superior stability, and at distance d=5 with squeezing parameter σ=0.20, both MWPM and Union-Find jointly form the Pareto frontier. Leveraging the LiDMaS+ v1.1.0 platform with cross-guidance and dense window scanning, parallel execution achieves up to a 1.94× speedup while preserving statistical fidelity.
📝 Abstract
We quantify decoder dependence in surface-code threshold studies under two matched regimes: Pauli noise and native GKP-style Gaussian displacement digitization. Using LiDMaS+ v1.1.0, we benchmark MWPM, Union-Find (UF), Belief Propagation (BP), and neural-guided MWPM with fixed seeds, identical sweep grids, and unified reporting across runs 06--14. At $d=5$ and $σ=0.20$, MWPM and UF define the Pareto frontier, with (runtime, LER) = (1.341 s, 0.2273) and (1.332 s, 0.2303); neural-guided MWPM is slower and less accurate (1.396 s, 0.3730), and BP is dominated (7.640 s, 0.6107). Crossing-bootstrap diagnostics are stable only for MWPM, with median $σ^\star_{3,5}=0.10$ (1911/2000 valid) and $σ^\star_{5,7}=0.1375$ (1941/2000 valid), while other decoders show no valid crossing samples. Dense-window scanning over $σ\in [0.08,0.24]$ returns NaN crossings for all decoders, confirming estimator- and window-sensitive threshold localization. Rank-stability and effect-size bootstrap analyses reinforce ordering robustness: BP remains rank 4, neural-guided MWPM rank 3, and MWPM-UF differences are small ($Δ_{\mathrm{MWPM-UF}}=-0.00383$, 95\% interval $[-0.0104,0.00329]$) across $σ\in [0.05,0.35]$. Threaded execution preserves statistical fidelity while improving throughput: $1.34\times$ speedup in Pauli mode and $1.94\times$ in native GKP mode, with mean $|Δ\mathrm{LER}|$ $6.07\times10^{-3}$ and $5.20\times10^{-3}$, respectively. We therefore recommend estimator-conditional threshold reporting coupled to runtime-fidelity checks for reproducible hardware-facing practical future decoder benchmarking workflows.