Semantic Manipulation Localization

📅 2026-04-11
🤖 AI Summary
This work addresses the limitations of existing image manipulation localization methods, which primarily rely on low-level artifact detection and struggle to identify subtle yet semantically critical forgeries. To bridge this gap, we introduce a novel task—semantic manipulation localization—and present the first fine-grained benchmark dataset tailored for this purpose. We further propose TRACE, an end-to-end framework that integrates semantic anchoring, frequency-domain perturbation awareness, and joint reasoning over semantic content and spatial extent to precisely localize manipulated regions even under high semantic consistency. Experiments demonstrate that TRACE significantly outperforms current approaches on our benchmark, yielding more complete, compact, and semantically coherent localization results. This study underscores the pivotal role of semantic awareness in image forensics and establishes a new paradigm for semantics-driven manipulation localization.

📝 Abstract
Image Manipulation Localization (IML) aims to identify edited regions in an image. However, with the increasing use of modern image editing and generative models, many manipulations no longer exhibit obvious low-level artifacts. Instead, they often involve subtle but meaning-altering edits to an object's attributes, state, or relationships while remaining highly consistent with the surrounding content. This makes conventional IML methods less effective because they mainly rely on artifact detection rather than semantic sensitivity. To address this issue, we introduce Semantic Manipulation Localization (SML), a new task that focuses on localizing subtle semantic edits that significantly change image interpretation. We further construct a dedicated fine-grained benchmark for SML using a semantics-driven manipulation pipeline with pixel-level annotations. Based on this task, we propose TRACE (Targeted Reasoning of Attributed Cognitive Edits), an end-to-end framework that models semantic sensitivity through three progressively coupled components: semantic anchoring, semantic perturbation sensing, and semantic-constrained reasoning. Specifically, TRACE first identifies semantically meaningful regions that support image understanding, then injects perturbation-sensitive frequency cues to capture subtle edits under strong visual consistency, and finally verifies candidate regions through joint reasoning over semantic content and semantic scope. Extensive experiments show that TRACE consistently outperforms existing IML methods on our benchmark and produces more complete, compact, and semantically coherent localization results. These results demonstrate the necessity of moving beyond artifact-based localization and provide a new direction for image forensics in complex semantic editing scenarios.
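The abstract describes combining semantically meaningful regions with perturbation-sensitive frequency cues. As a rough illustration of that general idea only, the sketch below computes a high-frequency residual with an FFT high-pass filter and keeps strong responses that fall inside a given region mask. This is not the TRACE implementation: the function names `high_pass_residual` and `candidate_mask`, the `cutoff` and `k` parameters, and the thresholding rule are all assumptions made for illustration, and the anchor mask is simply taken as an input rather than learned.

```python
import numpy as np


def high_pass_residual(gray, cutoff=0.1):
    """Return the high-frequency residual of a 2-D grayscale image.

    `cutoff` (hypothetical parameter) is a radius in normalized frequency
    (cycles per pixel); every component below it, including DC, is zeroed.
    """
    gray = np.asarray(gray, dtype=np.float64)
    h, w = gray.shape
    fy = np.fft.fftfreq(h)[:, None]   # vertical frequencies, shape (h, 1)
    fx = np.fft.fftfreq(w)[None, :]   # horizontal frequencies, shape (1, w)
    radius = np.sqrt(fx ** 2 + fy ** 2)
    spectrum = np.fft.fft2(gray)
    spectrum[radius < cutoff] = 0.0   # suppress low-frequency content
    return np.real(np.fft.ifft2(spectrum))


def candidate_mask(gray, anchor_mask, cutoff=0.1, k=3.0):
    """Flag pixels with an unusually strong high-frequency response
    that also lie inside `anchor_mask` (a stand-in for semantic regions).

    The rule mean + k * std is an illustrative threshold, not a value
    taken from the paper.
    """
    residual = np.abs(high_pass_residual(gray, cutoff))
    threshold = residual.mean() + k * residual.std()
    return (residual > threshold) & np.asarray(anchor_mask, dtype=bool)


if __name__ == "__main__":
    image = np.zeros((32, 32))
    image[8, 8] = 1.0                      # a single sharp local change
    anchors = np.ones((32, 32), dtype=bool)
    mask = candidate_mask(image, anchors)
    print(np.argwhere(mask))
```

A fixed hand-crafted filter like this reacts to any sharp detail, edited or not, which is part of why the abstract argues for coupling frequency cues with semantic reasoning rather than relying on low-level signals alone.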
Problem

Research questions and friction points this paper is trying to address.

Semantic Manipulation Localization
Image Manipulation Localization
semantic edits
visual consistency
image forensics
Innovation

Methods, ideas, or system contributions that make the work stand out.

Semantic Manipulation Localization
TRACE
semantic sensitivity
image forensics
fine-grained benchmark
Zhenshan Tan
Nanjing University of Information Science and Technology
Computer Vision, Cross-Modal, Network and Information Security
Chenhan Lu
Engineering Research Center of Digital Forensics, Ministry of Education, Nanjing University of Information Science and Technology, Nanjing, 210044, China
Yuxiang Huang
Tsinghua University
Efficient AI, Natural Language Processing, Machine Learning System
Ziwen He
Nanjing University of Information Science and Technology
Xiang Zhang
Engineering Research Center of Digital Forensics, Ministry of Education, Nanjing University of Information Science and Technology, Nanjing, 210044, China
Yuzhe Sha
Engineering Research Center of Digital Forensics, Ministry of Education, Nanjing University of Information Science and Technology, Nanjing, 210044, China
Xianyi Chen
Engineering Research Center of Digital Forensics, Ministry of Education, Nanjing University of Information Science and Technology, Nanjing, 210044, China
Tianrun Chen
Zhejiang University
Computer Vision, 3D Reconstruction, Computational Imaging, Large Vision-Language Model
Zhangjie Fu
Engineering Research Center of Digital Forensics, Ministry of Education, Nanjing University of Information Science and Technology, Nanjing, 210044, China