AI Summary
This work addresses semantic communication under limited information rate and compute, where the encoder and decoder pursue misaligned objectives. Focusing on strategic Gaussian semantic compression, it lets the two parties hold distinct quadratic objectives, models the decoder as an MMSE estimator, and reduces the encoder's problem to designing the posterior covariance under a rate constraint. This yields a "semantic water-filling" principle and a rate-constrained Gaussian persuasion mechanism. The analysis shows that architectural compute limits act as implicit rate constraints, so semantic accuracy improves exponentially with model depth and inference-time compute, and that multimodal observations eliminate the geometric-mean penalty inherent in remote encoding. By establishing Gaussian optimality under objective misalignment, the work provides an information-theoretic foundation for resource-constrained, efficient AI systems and offers a principled interpretation of multimodal large language models as posterior design mechanisms.
Abstract
We study strategic Gaussian semantic compression under rate and compute constraints, where an encoder and a decoder optimize distinct quadratic objectives. A latent Gaussian state generates a task-dependent semantic variable, and the decoder best-responds via MMSE estimation, which reduces the encoder's problem to posterior covariance design under an information-rate constraint. We characterize the strategic rate-distortion function in the direct, remote, and full-information regimes, derive semantic water-filling and rate-constrained Gaussian persuasion solutions, and establish Gaussian optimality under misaligned objectives. We further show that architectural compute limits act as implicit rate constraints, yielding exponential improvements in semantic accuracy with model depth and inference-time compute, while multimodal observation eliminates the geometric-mean penalty inherent to remote encoding. These results provide information-theoretic foundations for data- and energy-efficient AI and offer a principled interpretation of modern multimodal language models as posterior design mechanisms under resource constraints.
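For orientation, the sketch below implements classical reverse water-filling for a Gaussian source under mean-squared error, the aligned-objective baseline that a semantic water-filling solution would generalize. It is not the strategic solution derived in this work: the function name `reverse_water_filling`, its parameters, and the bisection solver are illustrative assumptions. It also shows the exponential decay of distortion with rate in the scalar case, D(R) = sigma^2 * 2^(-2R).

```python
import numpy as np

def reverse_water_filling(eigenvalues, rate_bits, iters=200):
    """Classical reverse water-filling (hypothetical helper, not the paper's method).

    eigenvalues: positive eigenvalues of the source covariance.
    rate_bits:   total rate budget in bits.
    Returns the water level theta and the per-component distortions.
    """
    lam = np.asarray(eigenvalues, dtype=float)
    lo, hi = 0.0, float(lam.max())
    for _ in range(iters):
        theta = 0.5 * (lo + hi)
        # Components with variance below theta receive zero rate.
        rate = 0.5 * np.sum(np.log2(np.maximum(lam / theta, 1.0)))
        if rate > rate_bits:
            lo = theta   # budget exceeded: raise the water level
        else:
            hi = theta   # budget met: try a lower water level
    theta = hi
    distortions = np.minimum(lam, theta)
    return theta, distortions

# Two components with variances 4 and 1, one bit of total rate.
theta, d = reverse_water_filling([4.0, 1.0], rate_bits=1.0)
total_distortion = float(d.sum())

# Scalar case: distortion decays exponentially in the rate.
_, d_scalar = reverse_water_filling([1.0], rate_bits=2.0)
```

In the two-component example, the entire bit goes to the high-variance component and the low-variance component sits exactly at the water level. Under misaligned objectives the allocation would instead be governed by the task-dependent semantic variable, which this baseline does not model.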