All-in-One Transferring Image Compression from Human Perception to Multi-Machine Perception

📅 2025-04-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
Learning-based image compression models struggle to simultaneously support multiple machine vision tasks (e.g., segmentation, detection), leading to redundant bitstreams, inefficient task adaptation, and task fragmentation. Method: We propose a heterogeneous adaptive compression framework that freezes a pre-trained encoder-decoder and introduces (i) a shared semantic adapter to extract cross-task generalizable representations, and (ii) lightweight task-specific adapters to preserve discriminative features. The framework employs parameter-efficient fine-tuning (PEFT), jointly optimized via a multi-task loss. Contribution/Results: Evaluated on the PASCAL-Context multi-task benchmark, our method supports diverse downstream tasks with a single compressed bitstream. It significantly outperforms full fine-tuning and state-of-the-art PEFT approaches, achieving superior trade-offs between compression rate and task performance. To our knowledge, this is the first work to validate a unified compression-and-transfer paradigm for multi-perception tasks.
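As a rough illustration of the training setup described above (all names, shapes, and the linear "codec" are hypothetical stand-ins, not the paper's actual modules), a PEFT scheme that freezes the codec and trains only adapter weights under a weighted multi-task loss might look like:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical frozen codec: a fixed linear map standing in for the
# pre-trained compression encoder-decoder (never updated during adaptation).
W_frozen = rng.normal(size=(8, 8))

def frozen_encoder(x):
    return x @ W_frozen  # frozen backbone: receives no gradient updates

# Trainable adapter parameters (the only weights PEFT touches).
shared_adapter = rng.normal(scale=0.1, size=(8, 8))
task_adapters = {t: rng.normal(scale=0.1, size=(8, 8)) for t in ("seg", "det")}

def task_features(x, task):
    z = frozen_encoder(x)
    z = z @ shared_adapter          # cross-task generalizable representation
    return z @ task_adapters[task]  # task-specific discriminative features

def multi_task_loss(x, targets, weights):
    # Weighted sum of per-task losses, jointly optimized over all adapters.
    return sum(w * np.mean((task_features(x, t) - y) ** 2)
               for (t, y), w in zip(targets.items(), weights))
```

Only `shared_adapter` and `task_adapters` would be exposed to the optimizer; the frozen codec guarantees the bitstream itself is unchanged across tasks.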

📝 Abstract
Efficiently transferring a Learned Image Compression (LIC) model from human perception to machine perception is an emerging challenge in vision-centric representation learning. Existing approaches typically adapt LIC to downstream tasks in a single-task manner, which is inefficient, lacks task interaction, and results in multiple task-specific bitstreams. To address these limitations, we propose an asymmetric adaptor framework that supports multi-task adaptation within a single model. Our method introduces a shared adaptor to learn general semantic features and task-specific adaptors to preserve task-level distinctions. With only lightweight plug-in modules and a frozen base codec, our method achieves strong performance across multiple tasks while maintaining compression efficiency. Experiments on the PASCAL-Context benchmark demonstrate that our method outperforms both fully fine-tuned and other parameter-efficient fine-tuned (PEFT) baselines, validating the effectiveness of multi-vision transfer.
Problem

Research questions and friction points this paper is trying to address.

Transferring image compression from human to multi-machine perception
Addressing inefficiency in single-task LIC adaptation for downstream tasks
Proposing asymmetric adaptor framework for multi-task model adaptation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Asymmetric adaptor framework for multi-task adaptation
Shared and task-specific adaptors for semantic features
Lightweight plug-in modules with frozen base codec
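The single-bitstream idea behind these contributions can be sketched as follows (hypothetical shapes and task names; linear maps stand in for the real codec and adaptors): the frozen codec encodes each image once, and the resulting latent is branched through the shared adaptor and then per-task adaptors.

```python
import numpy as np

rng = np.random.default_rng(1)
LATENT = 16

# Frozen base codec: one encode pass yields a single latent per image,
# i.e. one compressed bitstream shared by all downstream tasks.
W_enc = rng.normal(size=(32, LATENT))
encode = lambda img: img @ W_enc

# Lightweight plug-in adaptors (near-identity init; the only trainable parts).
W_shared = np.eye(LATENT) + rng.normal(scale=0.01, size=(LATENT, LATENT))
W_task = {t: np.eye(LATENT) + rng.normal(scale=0.01, size=(LATENT, LATENT))
          for t in ("segmentation", "detection", "normals")}

def decode_for_tasks(latent):
    shared = latent @ W_shared  # general semantic features (shared adaptor)
    # Task-specific adaptors preserve task-level distinctions.
    return {t: shared @ W for t, W in W_task.items()}

img = rng.normal(size=(1, 32))
latent = encode(img)             # single compressed representation
outputs = decode_for_tasks(latent)
print(sorted(outputs))           # ['detection', 'normals', 'segmentation']
```

The point of the asymmetry is visible in the shapes: one shared adaptor amortizes general semantics across tasks, while each task adds only one small matrix of extra parameters.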