Object-Attribute-Relation Representation based Video Semantic Communication

📅 2024-06-15
🏛️ arXiv.org
📈 Citations: 1
✨ Influential: 0
🤖 AI Summary
To address the inefficiency and poor semantic interpretability of video transmission under low-bandwidth, high-noise conditions, this paper proposes a video representation framework structured around object-attribute-relation (OAR) triplets as fundamental semantic units. Unlike end-to-end joint source-channel coding (JSCC) approaches that employ opaque, black-box semantic learning, our method decouples semantic modeling from channel coding: it first extracts and serializes OAR triplets, then designs a semantic-driven JSCC co-optimization mechanism, and finally constructs an OAR-guided generative reconstruction network. Evaluations on a traffic surveillance dataset demonstrate that the proposed method achieves significantly higher reconstruction quality than H.265 at extremely low bitrates, while markedly improving downstream task performance. To the best of our knowledge, this is the first work to realize semantic-level video communication that is interpretable, editable, and task-adaptive.

πŸ“ Abstract
With the rapid growth of multimedia data volume, there is an increasing need for efficient video transmission in applications such as virtual reality and future video streaming services. Semantic communication is emerging as a vital technique for ensuring efficient and reliable transmission in low-bandwidth, high-noise settings. However, most current approaches focus on joint source-channel coding (JSCC) that depends on end-to-end training. These methods often lack an interpretable semantic representation and struggle with adaptability to various downstream tasks. In this paper, we introduce the use of object-attribute-relation (OAR) as a semantic framework for videos to facilitate low bit-rate coding and enhance the JSCC process for more effective video transmission. We utilize OAR sequences for both low bit-rate representation and generative video reconstruction. Additionally, we incorporate OAR into the image JSCC model to prioritize communication resources for areas more critical to downstream tasks. Our experiments on traffic surveillance video datasets assess the effectiveness of our approach in terms of video transmission performance. The empirical findings demonstrate that our OAR-based video coding method not only outperforms H.265 coding at lower bit-rates but also synergizes with JSCC to deliver robust and efficient video transmission.
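To make the idea of an OAR sequence as a low bit-rate representation concrete, here is a minimal sketch of how per-frame object-attribute-relation data might be stored and packed into bytes. Everything in it is an assumption made for illustration: the class names (`SceneObject`, `Relation`, `OARFrame`), the attribute keys, and the use of JSON plus zlib as the serializer are hypothetical stand-ins, not the coding scheme used in the paper.

```python
import json
import zlib
from dataclasses import asdict, dataclass, field


@dataclass
class SceneObject:
    """One detected object and its attributes (hypothetical schema)."""
    obj_id: int
    category: str
    attributes: dict = field(default_factory=dict)


@dataclass
class Relation:
    """A directed relation between two objects, referenced by id."""
    subject_id: int
    predicate: str
    object_id: int


@dataclass
class OARFrame:
    """All objects and relations observed in one video frame."""
    frame_index: int
    objects: list
    relations: list


def serialize_oar(frames):
    """Pack a list of OARFrame into compressed bytes.

    JSON + zlib is only a placeholder for a real entropy coder.
    """
    payload = [asdict(frame) for frame in frames]
    text = json.dumps(payload, separators=(",", ":"), sort_keys=True)
    return zlib.compress(text.encode("utf-8"))


def deserialize_oar(blob):
    """Inverse of serialize_oar: rebuild the OARFrame list from bytes."""
    payload = json.loads(zlib.decompress(blob).decode("utf-8"))
    frames = []
    for item in payload:
        objects = [SceneObject(**o) for o in item["objects"]]
        relations = [Relation(**r) for r in item["relations"]]
        frames.append(OARFrame(item["frame_index"], objects, relations))
    return frames


def example_frames():
    """A toy two-frame traffic scene (made-up values)."""
    car = SceneObject(1, "car", {"color": "red", "bbox": [40, 60, 120, 110]})
    bus = SceneObject(2, "bus", {"color": "white", "bbox": [200, 50, 330, 140]})
    car_moved = SceneObject(1, "car", {"color": "red", "bbox": [52, 60, 132, 110]})
    follows = Relation(subject_id=1, predicate="behind", object_id=2)
    return [
        OARFrame(0, [car, bus], [follows]),
        OARFrame(1, [car_moved, bus], [follows]),
    ]
```

Calling `serialize_oar(example_frames())` yields a short byte string, and `deserialize_oar` recovers the same frames. In a system like the one the abstract describes, the receiver would hand the recovered sequence to a generative model for reconstruction; that step is not sketched here.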
Problem

Research questions and friction points this paper is trying to address.

Enhance video transmission efficiency
Improve semantic communication interpretability
Optimize resource allocation for downstream tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

OAR semantic framework
low bit-rate coding
generative video reconstruction
Qiyuan Du
Department of Electronic Engineering, Tsinghua University, Beijing, China; State Key Laboratory of Space Network and Communications, Beijing, China
Yiping Duan
Department of Electronic Engineering, Tsinghua University, Beijing, China; State Key Laboratory of Space Network and Communications, Beijing, China
Qianqian Yang
Zhejiang University
Information Theory, Wireless AI, Semantic Communication, Machine Learning
Xiaoming Tao
Tsinghua University
Wireless multimedia communications
Mérouane Debbah
KU 6G Research Centre, Khalifa University of Science and Technology, Abu Dhabi, United Arab Emirates; CentraleSupelec, Paris-Saclay University, 91192 Gif-sur-Yvette, France