🤖 AI Summary
This study addresses the longstanding trade-off between payload capacity and visual fidelity in steganography. We propose a novel attention-driven steganographic method that explicitly models human visual non-uniformity via foveated rendering, integrated with multimodal latent-space representation and a deep steganographic network. Our approach increases embedding capacity from 100 to 500 bits per cover image while preserving perceptual quality through a perception-aware loss function. Quantitatively, it achieves high visual fidelity with PSNR of 31.47 dB and LPIPS of 0.13. Bit error rate is reduced to 0.005% (1/20,000) over 200,000 test bits. To our knowledge, this is the first work to systematically incorporate foveated perception modeling into a steganographic framework, thereby breaking the conventional capacity–quality bottleneck. The method establishes a new paradigm for applications such as image metadata embedding and robust watermarking.
📝 Abstract
Steganography finds its use in visual medium such as providing metadata and watermarking. With support of efficient latent representations and foveated rendering, we trained models that improve existing capacity limits from 100 to 500 bits, while achieving better accuracy of up to 1 failure bit out of 2000, at 200K test bits. Finally, we achieve a comparable visual quality of 31.47 dB PSNR and 0.13 LPIPS, showing the effectiveness of novel perceptual design in creating multi-modal latent representations in steganography.