Stereo Image Coding for Machines with Joint Visual Feature Compression

📅 2025-02-20

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address the low coding efficiency of stereo images in machine vision tasks, this paper formulates stereo image coding for machines (SICM) and proposes MVSFC-Net, a machine vision-oriented stereo feature compression network. Unlike conventional human-vision-centric compression paradigms, MVSFC-Net extracts, compresses, and transmits stereo visual features for downstream 3D tasks such as depth estimation and stereo matching. Its Stereo Multi-scale Feature Compression (SMFC) module removes spatial, inter-view, and cross-scale redundancies simultaneously, turning sparse stereo multi-scale features into compact joint visual representations. The framework integrates differentiable quantization, entropy modeling, and joint rate-distortion optimization. Experimental results show that MVSFC-Net outperforms both the MPEG-recommended ICM anchors and the state-of-the-art stereo image compression (SIC) method in compression efficiency and 3D task performance.
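The rate-distortion trade-off mentioned in the summary can be illustrated with a minimal sketch. This is not the paper's implementation: the additive-noise quantization proxy, the logistic entropy model, and the names `quantize`, `rate_bits`, `rd_loss`, `lam`, and `scale` are all assumptions chosen for illustration, and `distortion` stands in for whatever 3D task loss is actually used.

```python
import numpy as np

def quantize(y, training=False, rng=None):
    """Round at inference; during training, add uniform noise in [-0.5, 0.5)
    as a commonly used differentiable stand-in for rounding (assumption)."""
    if training:
        rng = np.random.default_rng(0) if rng is None else rng
        return y + rng.uniform(-0.5, 0.5, size=y.shape)
    return np.round(y)

def logistic_cdf(x, scale):
    """CDF of a zero-mean logistic distribution (hypothetical entropy model)."""
    return 1.0 / (1.0 + np.exp(-x / scale))

def rate_bits(y_hat, scale=1.0):
    """Estimated bits: -log2 of the probability mass of each unit-width bin."""
    p = logistic_cdf(y_hat + 0.5, scale) - logistic_cdf(y_hat - 0.5, scale)
    return float(-np.log2(np.clip(p, 1e-9, 1.0)).sum())

def rd_loss(y, distortion, lam, scale=1.0, training=False):
    """Joint objective R + lam * D; `distortion` is a placeholder task loss."""
    y_hat = quantize(y, training)
    return rate_bits(y_hat, scale) + lam * distortion
```

Larger `lam` favors task accuracy over bitrate, and smaller `lam` favors bitrate; sweeping it traces out a rate-performance curve.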

📝 Abstract

2D image coding for machines (ICM) has achieved great success in coding efficiency, while less effort has been devoted to the stereo image field. To promote the efficiency of stereo image compression (SIC) and intelligent analysis, stereo image coding for machines (SICM) is formulated and explored in this paper. More specifically, a machine vision-oriented stereo feature compression network (MVSFC-Net) is proposed for SICM, in which stereo visual features are extracted, compressed, and transmitted for 3D visual tasks. To compress stereo visual features efficiently in MVSFC-Net, a stereo multi-scale feature compression (SMFC) module is designed to gradually transform sparse stereo multi-scale features into compact joint visual representations by removing spatial, inter-view, and cross-scale redundancies simultaneously. Experimental results show that the proposed MVSFC-Net achieves superior compression efficiency and 3D visual task performance compared with the existing ICM anchors recommended by MPEG and the state-of-the-art SIC method.
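To make the phrase "stereo multi-scale features" concrete, the sketch below shows only the data layout: a feature pyramid per view, brought to a common resolution and stacked into one tensor. It does not remove any redundancy and is not the SMFC module; in the paper that transformation is learned. The function names, the average pooling, and the `(C, H, W)` shapes are illustrative assumptions.

```python
import numpy as np

def avg_pool(x, factor):
    """Average-pool a (C, H, W) array by an integer factor along H and W."""
    c, h, w = x.shape
    return x.reshape(c, h // factor, factor, w // factor, factor).mean(axis=(2, 4))

def joint_representation(left_pyramid, right_pyramid):
    """Resample every scale of both views to the coarsest resolution and
    concatenate along the channel axis (hypothetical fusion, for shape only)."""
    coarsest_h = min(f.shape[1] for f in left_pyramid)
    parts = []
    for pyramid in (left_pyramid, right_pyramid):
        for f in pyramid:
            parts.append(avg_pool(f, f.shape[1] // coarsest_h))
    return np.concatenate(parts, axis=0)

# Three scales per view: channels grow as resolution shrinks.
left = [np.ones((2, 8, 8)), np.ones((4, 4, 4)), np.ones((8, 2, 2))]
right = [np.ones((2, 8, 8)), np.ones((4, 4, 4)), np.ones((8, 2, 2))]
joint = joint_representation(left, right)  # shape (28, 2, 2)
```

A learned compression module would take such stacked inputs and map them to far fewer channels, which is where spatial, inter-view, and cross-scale redundancy would actually be removed.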

Problem

Research questions and friction points this paper is trying to address.

Stereo image compression for machine analysis has received far less attention than 2D ICM

Stereo visual features must be extracted, compressed, and transmitted efficiently for 3D visual tasks

Compression efficiency and 3D visual task performance need to improve together

Innovation

Methods, ideas, or system contributions that make the work stand out.

Stereo Image Coding for Machines

Machine Vision-Oriented Network

Stereo Multi-Scale Feature Compression

🔎 Similar Papers

Content-aware Masked Image Modeling Transformer for Stereo Image Compression