Semantic Ordered Statistics Decoding

📅 2026-05-04

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the suboptimal soft-decoding performance of short codes at finite blocklength and on burst-error channels by proposing an ordered statistics decoding (OSD) method that incorporates a prior from a byte-level Transformer language model (LM). A fused bit-level score combines channel reliability with the LM prior and is used both to select the most-reliable basis (MRB) and to rank candidate codewords. The decoder enumerates two complementary families of test error patterns: conventional bit flips and LM-guided byte substitutions, the latter reaching error patterns that bit flipping cannot. In simulation over AWGN channels, the approach achieves block error rates below the finite-blocklength normal-approximation bound for uniform sources on binary BCH(127,64) and on shortened RS(16,8) over GF(256), a 1.5 dB coding gain over Fossorier OSD. On a Gilbert–Elliott burst-error channel, it gains 4 dB over Berlekamp–Massey decoding and 1 dB over standard OSD.
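The joint scoring described above can be illustrated with a minimal Python sketch. It rests on assumptions that the summary does not confirm: the fusion is taken to be a plain sum of channel and prior log-likelihood ratios (LLRs), the byte-level prior is marginalized to bits, the information bits are assumed to come first in the codeword, and parity bits get no prior. The function names `bit_prior_llrs`, `fused_llrs`, and `reliability_order` are hypothetical. A real MRB selection would also need Gaussian elimination on the permuted generator matrix to guarantee linear independence, which is omitted here.

```python
import math

EPS = 1e-12  # guards against log(0) when the prior is fully peaked


def bit_prior_llrs(byte_probs):
    """Marginalize a 256-entry byte distribution into 8 bit LLRs (MSB first).

    Convention: LLR = log P(bit = 0) / P(bit = 1), so positive favours 0.
    """
    llrs = []
    for j in range(8):
        mask = 1 << (7 - j)
        p1 = sum(p for b, p in enumerate(byte_probs) if b & mask)
        p0 = sum(p for b, p in enumerate(byte_probs) if not b & mask)
        llrs.append(math.log((p0 + EPS) / (p1 + EPS)))
    return llrs


def fused_llrs(channel_llrs, byte_prob_list):
    """Add the bit-level prior to the channel LLRs (assumed additive fusion).

    Assumes the bytes covered by byte_prob_list occupy the first positions
    of the codeword; remaining positions (e.g. parity) keep a zero prior.
    """
    prior = [l for probs in byte_prob_list for l in bit_prior_llrs(probs)]
    prior += [0.0] * (len(channel_llrs) - len(prior))
    return [c + p for c, p in zip(channel_llrs, prior)]


def reliability_order(llrs):
    """Return bit indices sorted from most to least reliable (largest |LLR| first)."""
    return sorted(range(len(llrs)), key=lambda i: -abs(llrs[i]))
```

With a uniform byte distribution the prior LLRs are zero, so the ordering reduces to the usual channel-only reliability ordering of OSD. A peaked prior pushes the corresponding bits toward the front of the ordering.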

📝 Abstract

We propose a Semantic Ordered Statistics Decoder (sem-OSD), a soft decoder for short linear block codes carrying byte-streamed sources such as natural-language text. Sem-OSD injects a byte-level language-model (LM) prior into ordered statistics decoding (OSD) through a fused bit-level score that combines channel reliability with the LM prior, and uses it for most-reliable basis (MRB) selection and codeword candidate scoring. Sem-OSD enumerates two complementary test-error-pattern (TEP) families: a bit-flip family that flips up to $m$ bits, and an LM-driven family of up to $\omega$ byte substitutions that reaches error patterns the bit-flip family cannot. The LM prior is computed by a byte-level Transformer fine-tuned for byte-level denoising. Simulation results show that, on AWGN, sem-OSD achieves block error rates (BLERs) below the finite-blocklength normal-approximation bound for uniform sources on both binary BCH$(127,64)$ and shortened RS$(16,8)$ over GF(256), exceeding Fossorier OSD by a $1.5$ dB coding gain. On a Gilbert--Elliott burst-error channel, sem-OSD provides $4$ dB and $1$ dB more coding gain than Berlekamp--Massey and OSD, respectively.
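To make the two TEP families concrete, the following Python sketch enumerates them under stated assumptions. The abstract does not say how the LM-driven family is built, so `lm_candidates` (a per-position list of byte values proposed by the LM) is a hypothetical input, and both function names are illustrative rather than taken from the paper.

```python
from itertools import combinations, product


def bit_flip_teps(k, m):
    """Yield every set of flipped positions among k bits with weight 0..m,
    lightest patterns first."""
    for w in range(m + 1):
        for positions in combinations(range(k), w):
            yield positions


def byte_substitution_teps(hard_bytes, lm_candidates, omega):
    """Yield patterns that replace between 1 and omega bytes.

    hard_bytes:    hard-decision bytes (bytes object or list of ints).
    lm_candidates: hypothetical per-position lists of alternative byte values.
    Each pattern is a tuple of (byte_index, new_value) pairs.
    """
    n = len(hard_bytes)
    for w in range(1, omega + 1):
        for idxs in combinations(range(n), w):
            # Keep only proposals that actually change the byte.
            options = [
                [v for v in lm_candidates[i] if v != hard_bytes[i]]
                for i in idxs
            ]
            for values in product(*options):
                yield tuple(zip(idxs, values))
```

One way to see why the families are complementary: replacing the byte `o` (0x6F) with `a` (0x61) changes three bits at once, so a single byte substitution already lies outside a bit-flip family with $m = 2$, while the bit-flip family with that $m$ over $k$ positions contains only $\sum_{i=0}^{2} \binom{k}{i}$ patterns.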

Problem

Research questions and friction points this paper is trying to address.

semantic decoding

short block codes

language model prior

ordered statistics decoding

byte-streamed sources

Innovation

Methods, ideas, or system contributions that make the work stand out.

Semantic Ordered Statistics Decoding

Language Model Prior

Byte-level Transformer