AI Summary
To address the inefficiency and poor semantic interpretability of video transmission under low-bandwidth, high-noise conditions, this paper proposes a video representation framework structured around object-attribute-relation (OAR) triplets as fundamental semantic units. Unlike end-to-end joint source-channel coding (JSCC) approaches that rely on black-box semantic learning, our method decouples semantic modeling from channel coding: it first extracts and serializes OAR triplets, then designs a semantic-driven JSCC co-optimization mechanism, and finally constructs an OAR-guided generative reconstruction network. Evaluations on a traffic surveillance dataset demonstrate that the proposed method achieves significantly higher reconstruction quality than H.265 at extremely low bit-rates, while markedly improving downstream task performance. To the best of our knowledge, this is the first work to realize semantic-level video communication that is interpretable, editable, and task-adaptive.
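The "extract and serialize OAR triplets" step above implies a compact, machine-readable semantic unit. The paper does not publish a schema, so the following is only a minimal illustrative sketch of what an OAR triplet and its serialization for transmission could look like; the field names (`obj`, `attributes`, `relation`) and the JSON wire format are assumptions, not the authors' actual representation.

```python
import json
from dataclasses import dataclass
from typing import Dict, Tuple, List

@dataclass
class OARTriplet:
    """One hypothetical object-attribute-relation semantic unit."""
    obj: str                    # object identity, e.g. "car_3"
    attributes: Dict[str, object]  # e.g. {"color": "red", "bbox": [10, 20, 40, 30]}
    relation: Tuple[str, str]   # (predicate, other object), e.g. ("behind", "truck_1")

def serialize_triplets(triplets: List[OARTriplet]) -> bytes:
    """Pack a frame's OAR triplets into a compact byte stream for transmission."""
    payload = [{"o": t.obj, "a": t.attributes, "r": list(t.relation)}
               for t in triplets]
    # separators=(",", ":") drops whitespace, keeping the bit-rate low
    return json.dumps(payload, separators=(",", ":")).encode("utf-8")

def deserialize_triplets(data: bytes) -> List[OARTriplet]:
    """Recover the triplet list at the receiver before generative reconstruction."""
    payload = json.loads(data.decode("utf-8"))
    return [OARTriplet(p["o"], p["a"], (p["r"][0], p["r"][1])) for p in payload]
```

A receiver-side generative network would then condition on the recovered triplets rather than on pixel data, which is what makes the representation interpretable and editable: individual triplets can be inspected or modified before reconstruction.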
Abstract
With the rapid growth of multimedia data volume, there is an increasing need for efficient video transmission in applications such as virtual reality and future video streaming services. Semantic communication is emerging as a vital technique for ensuring efficient and reliable transmission in low-bandwidth, high-noise settings. However, most current approaches focus on joint source-channel coding (JSCC) that depends on end-to-end training. These methods often lack an interpretable semantic representation and struggle with adaptability to various downstream tasks. In this paper, we introduce the use of object-attribute-relation (OAR) as a semantic framework for videos to facilitate low bit-rate coding and enhance the JSCC process for more effective video transmission. We utilize OAR sequences for both low bit-rate representation and generative video reconstruction. Additionally, we incorporate OAR into the image JSCC model to prioritize communication resources for areas more critical to downstream tasks. Our experiments on traffic surveillance video datasets assess the effectiveness of our approach in terms of video transmission performance. The empirical findings demonstrate that our OAR-based video coding method not only outperforms H.265 coding at lower bit-rates but also synergizes with JSCC to deliver robust and efficient video transmission.
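The abstract states that OAR information is fed into the image JSCC model "to prioritize communication resources for areas more critical to downstream tasks." One common way to realize such prioritization is a per-pixel importance map derived from the bounding boxes of semantically important objects, which can then weight the rate allocation or the distortion loss of the JSCC encoder. The sketch below illustrates that idea only; the function name, the weight values, and the mean-normalization are assumptions, not the paper's actual mechanism.

```python
from typing import List, Tuple

def priority_mask(height: int, width: int,
                  boxes: List[Tuple[int, int, int, int]],
                  fg_weight: float = 4.0, bg_weight: float = 1.0) -> List[List[float]]:
    """Build a per-pixel importance map from OAR object boxes (illustrative only).

    boxes: (x, y, box_width, box_height) rectangles of semantically important
    objects. Pixels inside any box receive fg_weight, background bg_weight;
    the map is normalized to mean 1 so the average bit budget is unchanged.
    """
    mask = [[bg_weight] * width for _ in range(height)]
    for x, y, bw, bh in boxes:
        for r in range(y, min(y + bh, height)):
            for c in range(x, min(x + bw, width)):
                mask[r][c] = fg_weight
    mean = sum(sum(row) for row in mask) / (height * width)
    return [[v / mean for v in row] for row in mask]
```

Weighting the JSCC distortion loss by such a map concentrates channel symbols on object regions, so detection or tracking tasks at the receiver degrade more gracefully as the channel worsens than with uniform allocation.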