Adaptive Forensic Feature Refinement via Intrinsic Importance Perception

📅 2026-04-18

🤖 AI Summary

This work addresses cross-distribution generalization in synthetic image detection (SID) when the generative source is unknown, and proposes the I2P framework. I2P formulates the adaptation of visual foundation models (VFMs) as a joint optimization problem with two parts. First, it adaptively selects the representation layers that are most discriminative for forgery cues by estimating feature importance across multiple levels. Second, it restricts fine-tuning to a low-sensitivity parameter subspace so that the transferable pretrained structure is preserved. The combination is intended to improve generalization to images from unseen synthesis models while limiting damage to the open-set recognition capabilities of the underlying VFM.

📝 Abstract

With the rapid development of generative models and multimodal content editing technologies, the key challenge for synthetic image detection (SID) is cross-distribution generalization to unknown generation sources. In recent years, visual foundation models (VFMs), which acquire rich visual priors through large-scale image-text alignment pretraining, have become a promising route for improving the generalization ability of SID. However, existing VFM-based methods remain coarse-grained in their adaptation strategies. They typically either use the final-layer representations of the VFM directly or simply fuse multi-layer features, without explicitly modeling which level of the representational hierarchy best carries transferable forgery cues. Meanwhile, although directly fine-tuning the VFM can improve task adaptation, it may also damage the cross-modal pretrained structure that supports open-set generalization. To address this task-specific tension, we reformulate VFM adaptation for SID as a joint optimization problem: it is necessary both to identify the critical representational layer best suited to carrying forgery-discriminative information and to limit the disturbance that task knowledge injection causes to the pretrained structure. Based on this, we propose I2P, an SID framework centered on intrinsic importance perception. I2P first adaptively identifies the critical layer representations that are most discriminative for SID, and then constrains task-driven parameter updates to a low-sensitivity parameter subspace, thereby improving task specificity while preserving the transferable structure of pretrained representations as much as possible.
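The two steps described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not say how layer importance or parameter sensitivity is computed, so the sketch assumes (a) one learnable importance logit per layer, turned into fusion weights by a softmax, and (b) a per-parameter sensitivity score supplied from elsewhere (for example, an accumulated squared-gradient estimate), with updates allowed only on the least sensitive fraction of parameters. All function names, `keep_ratio`, and the toy numbers are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_layers(layer_feats, layer_scores):
    """Fuse per-layer feature vectors using softmax(layer_scores) as weights.

    layer_feats: list of L feature vectors, each a list of D floats.
    layer_scores: list of L importance logits (assumed learnable).
    Returns the fused D-dim vector and the per-layer weights.
    """
    weights = softmax(layer_scores)
    dim = len(layer_feats[0])
    fused = [sum(w * f[d] for w, f in zip(weights, layer_feats))
             for d in range(dim)]
    return fused, weights


def low_sensitivity_mask(sensitivity, keep_ratio):
    """Build a 0/1 mask that selects the keep_ratio fraction of parameters
    with the LOWEST sensitivity; only those may be updated."""
    k = max(1, int(len(sensitivity) * keep_ratio))
    order = sorted(range(len(sensitivity)), key=lambda i: sensitivity[i])
    chosen = set(order[:k])
    return [1.0 if i in chosen else 0.0 for i in range(len(sensitivity))]


def masked_sgd_step(params, grads, mask, lr):
    """Plain SGD step restricted to the masked (low-sensitivity) coordinates."""
    return [p - lr * g * m for p, g, m in zip(params, grads, mask)]


# Step 1 (assumed form): pick the layer with the largest importance weight.
layer_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy features, 3 layers
layer_scores = [0.0, 0.0, 2.0]                      # toy importance logits
fused, weights = fuse_layers(layer_feats, layer_scores)
critical_layer = max(range(len(weights)), key=lambda i: weights[i])

# Step 2 (assumed form): update only the least sensitive half of the weights.
params = [0.5, -0.2, 0.1, 0.9]
sensitivity = [0.9, 0.1, 0.05, 0.7]  # toy scores; higher = more fragile
grads = [1.0, 1.0, 1.0, 1.0]
mask = low_sensitivity_mask(sensitivity, keep_ratio=0.5)
updated = masked_sgd_step(params, grads, mask, lr=0.1)
```

In this toy run the third layer receives the largest weight, and the two high-sensitivity parameters (the first and last) keep their pretrained values while the other two move. A hard 0/1 mask is only one way to restrict updates to a subspace; a soft, sensitivity-scaled learning rate would be another reasonable choice under the same assumptions.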

Problem

Research questions and friction points this paper is trying to address.

synthetic image detection

cross-distribution generalization

visual foundation models

pretrained structure preservation

unknown generation sources

Innovation

Methods, ideas, or system contributions that make the work stand out.

synthetic image detection

visual foundation models

intrinsic importance perception