ChronoSC: Task-Oriented Semantic Communication via Temporal-to-Color Encoding

📅 2026-05-11
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work targets three weaknesses of existing video semantic communication methods for video question answering: high bandwidth consumption, high latency, and difficult deployment on resource-constrained devices. The authors propose a lightweight semantic communication framework built around Chrono-Color Stacking, a mechanism that losslessly maps the temporal information in a video into a single static color image. This enables extreme temporal compression and explicit visual reconstruction without complex spatiotemporal modeling. Combined with a lightweight DeepJSCC transceiver and a pretrained BLIP vision-language model, the system achieves up to 192× bandwidth compression on the CLEVRER dataset while maintaining competitive question-answering accuracy.
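To make the temporal-to-color idea concrete, here is a minimal sketch of one way a clip could be collapsed into a single color image. It is an illustrative assumption, not the paper's Chrono-Color Stacking: the function name `chrono_color_stack`, the hue-per-frame palette, and the per-pixel maximum merge are all hypothetical, and this particular merge loses information wherever moving objects overlap, so it does not reproduce the lossless property the authors describe.

```python
import colorsys

import numpy as np


def chrono_color_stack(frames: np.ndarray) -> np.ndarray:
    """Collapse a (T, H, W) grayscale clip in [0, 1] into one (H, W, 3) RGB image.

    Hypothetical scheme: frame t is tinted with a hue that grows with t,
    and the tinted frames are merged with a per-pixel maximum.
    """
    num_frames = frames.shape[0]
    # Spread hues over [0, 2/3] (red to blue) so the first and last
    # frames map to clearly different colors.
    hues = np.linspace(0.0, 2.0 / 3.0, num_frames)
    palette = np.array([colorsys.hsv_to_rgb(h, 1.0, 1.0) for h in hues])  # (T, 3)
    # Broadcast each frame against its color: (T, H, W, 1) * (T, 1, 1, 3).
    tinted = frames[:, :, :, None] * palette[:, None, None, :]
    # Keep the brightest tinted value per pixel and channel.
    return tinted.max(axis=0)


# Toy clip: a single bright pixel moving left to right over three frames.
clip = np.zeros((3, 1, 3))
clip[0, 0, 0] = 1.0
clip[1, 0, 1] = 1.0
clip[2, 0, 2] = 1.0
chrono_image = chrono_color_stack(clip)
```

In the toy clip, the position of the pixel at each time step ends up in a different color, so a single static image carries the motion trail. The channel cost drops from T frames to one image, which is the kind of temporal compression the summary refers to.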
📝 Abstract
Semantic communication (SC) aims to reduce transmission overhead by conveying task-relevant information rather than raw data. However, existing SC approaches for video largely focus on pixel-level reconstruction or rely on complex spatiotemporal pipelines, leading to excessive bandwidth usage and latency that are unsuitable for low-resource deployments. In this paper, we propose ChronoSC, a task-oriented semantic communication framework for Video Question Answering (VideoQA). ChronoSC introduces Chrono-Color Stacking, a lightweight and lossless projection scheme that encodes temporal video dynamics into a single static image, enabling extreme temporal compression before transmission. This compact semantic representation is transmitted using a lightweight Deep Joint Source-Channel Coding (DeepJSCC) transceiver and explicitly reconstructed at the receiver. Unlike latent-space methods, explicit visual reconstruction enables the direct reuse of pre-trained vision-language models; specifically, a pre-trained BLIP model is employed to infer answers from noisy, reconstructed chrono-images. Experiments on the CLEVRER dataset show that ChronoSC achieves up to 192 times bandwidth reduction compared to raw video transmission while maintaining high VideoQA accuracy.
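The "noisy, reconstructed chrono-images" mentioned above come from sending encoder outputs through a noisy channel. The sketch below shows the channel step commonly used when training and evaluating DeepJSCC-style systems: normalize the symbols to unit average power, then add white Gaussian noise at a target SNR. The abstract does not state which channel model ChronoSC uses, so the AWGN model, the real-valued symbols, and the function name `awgn_channel` are assumptions made for illustration.

```python
import numpy as np


def awgn_channel(symbols: np.ndarray, snr_db: float, rng: np.random.Generator) -> np.ndarray:
    """Pass real-valued symbols through an additive white Gaussian noise channel.

    Assumes the input has non-zero power. The symbols are scaled to unit
    average power so that snr_db fully determines the noise variance.
    """
    normalized = symbols / np.sqrt(np.mean(symbols ** 2))
    # With unit signal power, noise power = 10^(-SNR_dB / 10).
    noise_power = 10.0 ** (-snr_db / 10.0)
    noise = rng.normal(0.0, np.sqrt(noise_power), size=normalized.shape)
    return normalized + noise


rng = np.random.default_rng(0)
# Stand-in for encoder output; a real system would use learned features.
encoded = 3.0 * rng.normal(size=10_000)
received = awgn_channel(encoded, snr_db=10.0, rng=rng)
```

A decoder would then map `received` back to an image, which a pretrained vision-language model can consume directly because the reconstruction is an ordinary picture rather than a latent vector.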
Problem

Research questions and friction points this paper is trying to address.

Semantic Communication
Video Question Answering
Bandwidth Efficiency
Temporal Compression
Low-Resource Deployment
Innovation

Methods, ideas, or system contributions that make the work stand out.

Temporal-to-Color Encoding
Task-Oriented Semantic Communication
Chrono-Color Stacking
Deep Joint Source-Channel Coding
Video Question Answering