The Construction of Near-optimal Universal Coding of Integers

📅 2025-07-30
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper studies the minimum expansion factor of universal integer prefix codes for discrete memoryless sources with unknown distributions over countably infinite alphabets. The best previously known result places the minimum expansion factor of the optimal UCI between 2 and 2.5. The paper constructs a new class, the ν code, that lowers the upper bound to 2.0386, and a second new class, the Δδ code, and shows that both are currently optimal in terms of minimum expansion factor. It also gives a new proof that the minimum expansion factor of the optimal UCI is at least 2. The lower bound itself is unchanged, so the known range for the optimal UCI shrinks from [2, 2.5] to [2, 2.0386].

📝 Abstract
Universal Coding of Integers (UCI) is suitable for discrete memoryless sources with unknown probability distributions and countably infinite alphabets. A UCI is a class of prefix codes such that the ratio of the average codeword length to $\max\{1, H(P)\}$ is within a constant expansion factor $K_{\mathcal{C}}$ for any decreasing probability distribution $P$, where $H(P)$ is the entropy of $P$. For any UCI code $\mathcal{C}$, define the minimum expansion factor $K_{\mathcal{C}}^{*}$ as the infimum of the set of expansion factors of $\mathcal{C}$. Each $\mathcal{C}$ has a unique corresponding $K_{\mathcal{C}}^{*}$, and the smaller $K_{\mathcal{C}}^{*}$ is, the better the compression performance of $\mathcal{C}$. A class of UCI $\mathcal{C}$ (or family $\{\mathcal{C}_i\}_{i=1}^{\infty}$) achieving the smallest $K_{\mathcal{C}}^{*}$ is defined as the optimal UCI. The best result currently is that the range of $K_{\mathcal{C}}^{*}$ for the optimal UCI is $2 \leq K_{\mathcal{C}}^{*} \leq 2.5$. In this paper, we prove that there exists a class of near-optimal UCIs, called the $\nu$ code, that achieves $K_{\nu} = 2.0386$. This narrows the range of the minimum expansion factor for the optimal UCI to $2 \leq K_{\mathcal{C}}^{*} \leq 2.0386$. Another new class of UCI, called the $\Delta\delta$ code, is specifically constructed. We show that the $\Delta\delta$ code and the $\nu$ code are currently optimal in terms of minimum expansion factor. In addition, we propose a new proof that the minimum expansion factor of the optimal UCI is lower bounded by $2$.
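
To make the ratio in the abstract concrete, here is a minimal Python sketch that measures average codeword length divided by max{1, H(P)} for one decreasing distribution. It uses the classical Elias γ and δ codes as stand-ins, because the abstract does not describe the ν or Δδ constructions; the function names and the truncated power-law test distribution are assumptions made for illustration only. A ratio computed for a single distribution is only a lower bound on a code's minimum expansion factor, which is taken over all decreasing distributions.

```python
import math


def elias_gamma(n: int) -> str:
    """Elias gamma codeword for n >= 1 (a stand-in, not the paper's code)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    binary = bin(n)[2:]                      # binary expansion, leading bit is 1
    return "0" * (len(binary) - 1) + binary  # unary length prefix, then the bits


def elias_delta(n: int) -> str:
    """Elias delta codeword for n >= 1 (a stand-in, not the paper's code)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    binary = bin(n)[2:]
    # Gamma-code the bit length, then append the bits after the leading 1.
    return elias_gamma(len(binary)) + binary[1:]


def entropy(probs) -> float:
    """Shannon entropy H(P) in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def expansion_ratio(encode, probs) -> float:
    """Average codeword length divided by max{1, H(P)}.

    probs[0] is the probability of the integer 1, probs[1] of 2, and so on.
    """
    avg_len = sum(p * len(encode(n)) for n, p in enumerate(probs, start=1))
    return avg_len / max(1.0, entropy(probs))


def truncated_power_law(size: int, exponent: float = 2.0):
    """Hypothetical decreasing test distribution: p_n proportional to n^(-exponent)."""
    weights = [n ** -exponent for n in range(1, size + 1)]
    total = sum(weights)
    return [w / total for w in weights]
```

For example, `expansion_ratio(elias_delta, truncated_power_law(1000))` evaluates the ratio for the δ code on one heavy-tailed source; sweeping the exponent gives a rough feel for how the ratio varies across decreasing distributions.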

Problem

Research questions and friction points this paper is trying to address.

Constructs near-optimal universal integer codes for compression
Narrows the known range of the minimum expansion factor for the optimal UCI from [2, 2.5] to [2, 2.0386]
Gives a new proof that the minimum expansion factor of the optimal UCI is at least 2
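
The quantities in these points can be written out as follows. This is a restatement of the abstract's definitions, not a formula quoted from the paper: the symbols $p_n$ and $L_{\mathcal{C}}(n)$ and the finite-entropy condition are assumed notation introduced here.

```latex
% Assumed notation: p_n = P(n), and L_C(n) is the length of the codeword for n.
\[
  \frac{\sum_{n \ge 1} p_n \, L_{\mathcal{C}}(n)}{\max\{1, H(P)\}} \le K_{\mathcal{C}}
  \quad \text{for every decreasing } P \text{ with } H(P) < \infty ,
\]
% The minimum expansion factor is the infimum of all valid K_C,
% which equals the supremum of the ratio over P.
\[
  K_{\mathcal{C}}^{*}
  = \inf \{ K_{\mathcal{C}} \}
  = \sup_{P} \frac{\sum_{n \ge 1} p_n \, L_{\mathcal{C}}(n)}{\max\{1, H(P)\}} ,
\]
% Known range for the optimal UCI after this paper's result:
\[
  2 \le K_{\mathcal{C}}^{*} \le 2.0386 .
\]
```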

Innovation

Methods, ideas, or system contributions that make the work stand out.

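Constructs a near-optimal UCI class, the ν code, with K_ν = 2.0386
Introduces the Δδ code, a second new UCI class that, like the ν code, is currently optimal in terms of minimum expansion factor
Gives a new proof that the minimum expansion factor of the optimal UCI is lower bounded by 2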