DocQT: Improving Document Forgery Localization Robustness via Diverse JPEG Quantization Tables

📅 2026-05-19

🤖 AI Summary
This study addresses the limited generalization of document tampering localization models in real-world applications, which it traces to training on a narrow distribution of JPEG quantization tables (standard libjpeg quality factors). It shows that quantization table diversity has a critical and previously overlooked impact on the robustness of forgery localization, using a controlled comparison of two architectures, FFDN and Mesorch, that differ in their level of quantization table awareness. To support this, the authors introduce DocQT, a bank of quantization tables derived from an operational insurance image corpus (MAIF). Training with tables sampled from DocQT (Real-QT) instead of standard quality factor augmentation (Standard-QT) yields substantial localization gains on the DocTamper benchmark and significantly reduces the pixel-level false positive rate on authentic operational documents, but only for architectures that explicitly take the quantization table as input.
📝 Abstract
Document manipulation localization models achieve strong performance on public benchmarks yet fail to generalize to operational document workflows. We identify a critical and overlooked source of this gap: the mismatch between the narrow distribution of JPEG quantization tables used during training, restricted to standard libjpeg quality factors, and the heterogeneous compression profiles encountered in real-world insurance document pipelines. To isolate this factor, we conduct a controlled factorial study comparing two architectures with contrasting levels of quantization table awareness, FFDN [2] and Mesorch [20], each trained under either standard quality factor augmentation (Standard-QT) or operationally calibrated quantization tables sampled from DocQT, a quantization-table bank derived from a MAIF operational image corpus (Real-QT), and evaluated under three recompression conditions. Training under Real-QT yields substantial localization gains on DocTamper [15] and significantly reduces the pixel-level false positive rate on authentic operational documents, but only for architectures that explicitly ingest the quantization table as input. The released DocQT quantization-table dataset and compression-reproduction material are directly available at https://github.com/Kyliroco/Improving-Document-Forgery-Localization-Robustness-via-Diverse-JPEG-Quantization-Tables. These results demonstrate that standard quality factor augmentation does not adequately proxy operational compression diversity, and that architectural choices explicitly conditioning on the quantization table provide a meaningful robustness advantage for real-world deployment.
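The contrast between the two training regimes can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes that Standard-QT draws a quality factor and derives the table with the usual libjpeg scaling rule, and that Real-QT draws an observed table set from a bank. The function names, the quality range of 50 to 100, and the bank layout (a list of table sets, each a list of 64-integer tables) are hypothetical. The `recompress` helper assumes a Pillow image object and relies on Pillow's `qtables` option for JPEG saving.

```python
import io
import random

# Example luminance table from Annex K of the JPEG standard, row-major order.
BASE_LUMA = [
    16, 11, 10, 16, 24, 40, 51, 61,
    12, 12, 14, 19, 26, 58, 60, 55,
    14, 13, 16, 24, 40, 57, 69, 56,
    14, 17, 22, 29, 51, 87, 80, 62,
    18, 22, 37, 56, 68, 109, 103, 77,
    24, 35, 55, 64, 81, 104, 113, 92,
    49, 64, 78, 87, 103, 121, 120, 101,
    72, 92, 95, 98, 112, 100, 103, 99,
]


def standard_luma_table(quality):
    """Scale BASE_LUMA the way libjpeg's quality setting does (baseline clamp)."""
    quality = max(1, min(100, quality))
    scale = 5000 // quality if quality < 50 else 200 - 2 * quality
    return [max(1, min(255, (q * scale + 50) // 100)) for q in BASE_LUMA]


def sample_standard_luma(rng, low=50, high=100):
    """Standard-QT style (assumed): draw a quality factor, derive the table."""
    return standard_luma_table(rng.randint(low, high))


def sample_real_qt(bank, rng):
    """Real-QT style (assumed): draw one observed set of tables from a bank."""
    return rng.choice(bank)


def recompress(image, tables):
    """Re-encode a Pillow image with explicit tables; returns JPEG bytes."""
    buf = io.BytesIO()
    # No `quality` argument is passed, so Pillow is not asked to rescale the tables.
    image.save(buf, format="JPEG", qtables=tables)
    return buf.getvalue()
```

Under this sketch, every Standard-QT table is a scaled copy of one base table, so the family is indexed by a single integer, whereas a bank of observed tables can contain shapes that no quality factor reproduces. The element order Pillow expects for `qtables` should be checked against its documentation before the helper is used on real data.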
Problem

Research questions and friction points this paper is trying to address.

document forgery localization
JPEG quantization tables
compression diversity
generalization gap
operational document workflows
Innovation

Methods, ideas, or system contributions that make the work stand out.

JPEG quantization tables
document forgery localization
compression diversity
real-world robustness
quantization-aware architecture