🤖 AI Summary
To address spectrum scarcity in edge video transmission, the lack of scene awareness in conventional semantic communication, and severe static-frame redundancy, this paper proposes a lightweight framework that deeply integrates semantic communication with visual perception. Our method introduces a dynamic scene-aware mechanism driven by object detection/segmentation and designs a compression-ratio-adaptive semantic encoder to enable on-demand, semantic-level transmission. The framework incorporates lightweight visual understanding, context-aware decision-making, and edge-device-friendly deployment. Simulation results demonstrate that, while preserving critical semantic accuracy, the proposed approach improves spectral efficiency by over 40% and reduces redundant static-frame transmission by more than 90%, thereby significantly overcoming the “blind compression” limitation inherent in traditional semantic communication systems.
📝 Abstract
Despite the widespread adoption of vision sensors in edge applications such as surveillance, the transmission of video data consumes substantial spectrum resources. Semantic communication (SC) offers a solution by extracting and compressing information at the semantic level, preserving the accuracy and relevance of transmitted data while significantly reducing the volume of information sent. However, traditional SC methods are inefficient for edge video because they repeatedly transmit static frames; lacking sensing capabilities, they cannot tell static frames from dynamic ones, which wastes spectrum. To address this challenge, we propose an SC with computer vision sensing (SCCVS) framework for edge video transmission. The framework first introduces a compression ratio (CR) adaptive SC (CRSC) model, which adjusts the CR according to whether a frame is static or dynamic, effectively conserving spectrum resources. Additionally, we implement an object detection and semantic segmentation models-enabled sensing (OSMS) scheme, which intelligently senses scene changes and assesses the significance of each frame through in-context analysis. The OSMS scheme then provides CR prompts to the CRSC model based on real-time sensing results. Moreover, both CRSC and OSMS are designed as lightweight models, ensuring compatibility with the resource-constrained sensors common in practical edge applications. Experimental simulations validate the effectiveness of the proposed SCCVS framework, demonstrating its ability to enhance transmission efficiency without sacrificing critical semantic information.
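The sensing-to-encoding interaction described above can be sketched as follows. This is a minimal illustrative mock-up, not the paper's implementation: the function names, the pixel-difference scoring (a stand-in for the detection/segmentation-based sensing), and the threshold and CR values are all hypothetical assumptions.

```python
# Hypothetical sketch of the OSMS -> CRSC interaction: a sensing step
# scores each frame's change significance, and the encoder receives a
# compression ratio (CR) prompt accordingly. All names, thresholds, and
# CR values are illustrative assumptions, not the paper's code.

def frame_change_score(prev_frame, curr_frame):
    """Mean absolute pixel difference, standing in for the object
    detection / semantic segmentation-based scene-change sensing."""
    return sum(abs(p - c) for p, c in zip(prev_frame, curr_frame)) / len(curr_frame)

def cr_prompt(score, static_threshold=2.0):
    """Map sensed significance to a CR prompt: static frames get an
    aggressive CR (transmit very little), dynamic frames a more
    conservative CR that preserves semantic detail."""
    return 0.05 if score < static_threshold else 0.5

# Toy 1-D "frames": a near-static pair and a clearly changed pair.
static_pair = ([10, 10, 10, 10], [10, 11, 10, 10])
dynamic_pair = ([10, 10, 10, 10], [90, 80, 70, 60])

crs = [cr_prompt(frame_change_score(a, b)) for a, b in (static_pair, dynamic_pair)]
print(crs)  # static frame -> low CR, dynamic frame -> higher CR
```

The key design point mirrored here is that sensing and encoding stay decoupled: OSMS only emits a lightweight CR prompt, so the CRSC encoder can run on resource-constrained sensors without itself performing scene analysis.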