🤖 AI Summary
This paper studies the minimum expansion factor of universal integer prefix codes for discrete memoryless sources with unknown distributions and countably infinite alphabets. The best previously known result places the minimum expansion factor of the optimal UCI between 2 and 2.5. We propose a new ν-code construction that lowers the upper bound to 2.0386, and we show that the Δδ-code and the ν-code are currently the best known codes in terms of minimum expansion factor. Methodologically, we combine prefix coding theory, asymptotic analysis, and entropy-based arguments, and we give a new proof that the minimum expansion factor of the optimal UCI is lower bounded by 2. These results narrow the range for the optimal universal integer code from [2, 2.5] to [2, 2.0386], bringing the upper bound close to the lower bound of 2.
📝 Abstract
Universal Coding of Integers (UCI) is suitable for discrete memoryless sources with unknown probability distributions and countably infinite alphabets. A UCI is a class of prefix codes such that the ratio of the average codeword length to $\max\{1, H(P)\}$ is within a constant expansion factor $K_{\mathcal{C}}$ for any decreasing probability distribution $P$, where $H(P)$ is the entropy of $P$. For any UCI code $\mathcal{C}$, define the minimum expansion factor $K_{\mathcal{C}}^{*}$ as the infimum of the set of expansion factors of $\mathcal{C}$. Each $\mathcal{C}$ has a unique corresponding $K_{\mathcal{C}}^{*}$, and the smaller $K_{\mathcal{C}}^{*}$ is, the better the compression performance of $\mathcal{C}$. A class of UCI $\mathcal{C}$ (or family $\{\mathcal{C}_i\}_{i=1}^{\infty}$) achieving the smallest $K_{\mathcal{C}}^{*}$ is defined as the optimal UCI. The best result so far is that the range of $K_{\mathcal{C}}^{*}$ for the optimal UCI is $2\leq K_{\mathcal{C}}^{*}\leq 2.5$. In this paper, we prove that there exists a class of near-optimal UCIs, called the $\nu$ code, that achieves $K_{\nu}=2.0386$. This narrows the range of the minimum expansion factor for the optimal UCI to $2\leq K_{\mathcal{C}}^{*}\leq 2.0386$. Another new class of UCI, called the $\Delta\delta$ code, is specifically constructed. We show that the $\Delta\delta$ code and the $\nu$ code are currently optimal in terms of minimum expansion factor. In addition, we propose a new proof that the minimum expansion factor of the optimal UCI is lower bounded by $2$.
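To make the ratio in the definition concrete, the following is a minimal Python sketch that evaluates the average codeword length divided by $\max\{1, H(P)\}$ for one finite, decreasing distribution. It uses the classical Elias gamma code as a stand-in, not the $\nu$ or $\Delta\delta$ codes of the paper, whose constructions are not given in this abstract. The function names and the truncated $1/n^2$ distribution are assumptions chosen for illustration.

```python
import math


def elias_gamma_length(n: int) -> int:
    """Length in bits of the Elias gamma codeword for an integer n >= 1."""
    # floor(log2 n) zeros followed by the (floor(log2 n) + 1)-bit binary form of n.
    return 2 * (n.bit_length() - 1) + 1


def entropy_bits(p):
    """Shannon entropy H(P) in bits; zero-probability terms contribute nothing."""
    return -sum(q * math.log2(q) for q in p if q > 0)


def expansion_ratio(p, length_fn):
    """Average codeword length divided by max(1, H(P)).

    p[i] is the probability of the integer i + 1 and is assumed to be
    non-increasing, as the UCI definition requires.
    """
    avg_len = sum(q * length_fn(n) for n, q in enumerate(p, start=1))
    return avg_len / max(1.0, entropy_bits(p))


if __name__ == "__main__":
    # Illustrative distribution (an assumption): P(n) proportional to 1/n^2,
    # truncated to n <= 1000 and renormalized.
    weights = [1.0 / (n * n) for n in range(1, 1001)]
    total = sum(weights)
    p = [w / total for w in weights]
    print(expansion_ratio(p, elias_gamma_length))
```

The value computed for any single distribution is only a lower bound on the minimum expansion factor of that code, since $K_{\mathcal{C}}^{*}$ must bound the ratio over all decreasing distributions on the positive integers.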