🤖 AI Summary
This work identifies a gap in the "noise coordinate independence" modeling used in the security analysis of quasi-cyclic code-based public-key cryptography (e.g., HQC): when sparse error vectors are multiplied by fixed sparse quasi-cyclic matrices and summed, the coordinates of the resulting noise vector can exhibit non-negligible statistical dependencies.
Method: Combining probabilistic analysis, coding theory, and the algebra of quasi-cyclic matrices, the authors construct explicit cases in which the independence modeling fails, and they characterize how these coordinate dependencies bear on the Decryption Failure Rate (DFR) analysis.
Contribution/Results: While the weight of the noise vector remains concentrated around its expectation — so the empirically verified concentration heuristic still stands — its coordinates are not independent, meaning DFR analyses that assume independence may mischaracterize the noise behavior. The result refines the noise model underlying HQC and related schemes, and it lays groundwork toward potential worst-case-to-average-case reductions for quasi-cyclic code-based cryptography.
📝 Abstract
Cryptography based on the presumed hardness of decoding codes -- i.e., code-based cryptography -- has recently seen increased interest due to its plausible security against quantum attackers. Notably, of the four proposals in the NIST post-quantum standardization process that were advanced to the fourth round for further review, two were code-based. The most efficient proposals -- including HQC and BIKE, the NIST submissions alluded to above -- in fact rely on the presumed hardness of decoding structured codes. Of particular relevance to our work, HQC is based on quasi-cyclic codes, which are codes generated by matrices consisting of two cyclic blocks. In particular, the security analysis of HQC requires a precise understanding of the Decryption Failure Rate (DFR), whose analysis relies on the following heuristic: given random ``sparse'' vectors $e_1,e_2$ (say, each coordinate is i.i.d. Bernoulli) multiplied by fixed ``sparse'' quasi-cyclic matrices $A_1,A_2$, the weight of the resulting vector $e_1A_1+e_2A_2$ is very concentrated around its expectation. In the HQC documentation, the authors model the distribution of $e_1A_1+e_2A_2$ as a vector with independent coordinates (and correct marginal distribution). However, we uncover cases where this modeling fails. While this does not invalidate the (empirically verified) heuristic that the weight of $e_1A_1+e_2A_2$ is concentrated, it does suggest that the behavior of the noise is a bit more subtle than previously predicted. Lastly, we discuss implications of our result for potential worst-case to average-case reductions for quasi-cyclic codes.
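The phenomenon in the abstract is easy to observe numerically. The toy simulation below (a sketch with illustrative parameters — block length, Bernoulli rate, and circulant supports are hypothetical choices, far smaller than HQC's actual parameters) samples $e_1A_1+e_2A_2$ over GF(2) with $A_1,A_2$ circulant, and shows that the weight concentrates near its expectation while output coordinates that share an input bit are visibly correlated, i.e., not independent:

```python
import numpy as np

# Toy-scale simulation of the noise term e1*A1 + e2*A2.
# n, p, and the circulant supports are illustrative, not HQC parameters.
rng = np.random.default_rng(0)
n = 64      # block length of each cyclic block
p = 0.05    # i.i.d. Bernoulli parameter for the sparse error vectors

def cyclic_mul(e, a):
    """e times the circulant matrix with first row a, over GF(2)
    (equivalently, cyclic convolution of the binary vectors mod 2)."""
    out = np.zeros(len(e), dtype=np.int64)
    for shift in np.flatnonzero(a):
        out ^= np.roll(e, shift)
    return out

# Fixed sparse generators: first rows of the circulant blocks A1, A2.
a1 = np.zeros(n, dtype=np.int64); a1[[0, 1, 3]] = 1
a2 = np.zeros(n, dtype=np.int64); a2[[0, 2, 5]] = 1

trials = 20000
samples = np.empty((trials, n), dtype=np.int64)
for t in range(trials):
    e1 = (rng.random(n) < p).astype(np.int64)
    e2 = (rng.random(n) < p).astype(np.int64)
    samples[t] = cyclic_mul(e1, a1) ^ cyclic_mul(e2, a2)

# Each output coordinate is the XOR of 6 i.i.d. Bernoulli(p) bits, so its
# marginal is Bernoulli(q) with q = (1 - (1 - 2p)^6) / 2, and the mean
# weight is n*q -- the empirical mean concentrates around this value.
q = 0.5 * (1.0 - (1.0 - 2.0 * p) ** 6)
print(f"predicted mean weight: {n * q:.2f}")
print(f"empirical mean weight: {samples.sum(axis=1).mean():.2f}")

# Coordinates 0 and 1 both involve the input bit e1[0] (a1's support
# contains indices differing by 1), so they are positively correlated.
cov01 = np.cov(samples[:, 0], samples[:, 1])[0, 1]
print(f"empirical cov(coord 0, coord 1): {cov01:.4f}")
```

The nonzero covariance between neighboring coordinates is exactly the kind of dependence that an independent-coordinates model of $e_1A_1+e_2A_2$ cannot capture, even though the total weight still concentrates as the heuristic predicts.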