MT-Mark: Rethinking Image Watermarking via Mutual-Teacher Collaboration with Adaptive Feature Modulation

📅 2025-12-22

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing deep image watermarking methods couple the embedder and extractor only weakly through a final loss and otherwise optimize them in isolation, so the embedder receives no decoding-aware guidance and the extractor cannot steer embedding during training. This paper reformulates the two as explicitly collaborating modules trained under a mutual-teacher paradigm. Key contributions include: (1) a Collaborative Interaction Mechanism (CIM) that establishes direct, bidirectional communication between the embedder and extractor; (2) an Adaptive Feature Modulation Module (AFMM) for content-aware feature regulation, which decouples modulation structure from modulation strength; and (3) a closed-loop arrangement in which the AFMMs on both sides align embedding behavior with extraction objectives, so that robustness comes from coordinated representation learning rather than exhaustive distortion simulation. Experiments on real-world and AI-generated datasets report higher watermark extraction accuracy than state-of-the-art approaches while maintaining high perceptual quality, with strong robustness and generalization.

📝 Abstract

Existing deep image watermarking methods follow a fixed embedding-distortion-extraction pipeline, where the embedder and extractor are weakly coupled through a final loss and optimized in isolation. This design lacks explicit collaboration, leaving no structured mechanism for the embedder to incorporate decoding-aware cues or for the extractor to guide embedding during training. To address this architectural limitation, we rethink deep image watermarking by reformulating embedding and extraction as explicitly collaborative components. To realize this reformulation, we introduce a Collaborative Interaction Mechanism (CIM) that establishes direct, bidirectional communication between the embedder and extractor, enabling a mutual-teacher training paradigm and coordinated optimization. Built upon this explicitly collaborative architecture, we further propose an Adaptive Feature Modulation Module (AFMM) to support effective interaction. AFMM enables content-aware feature regulation by decoupling modulation structure and strength, guiding watermark embedding toward stable image features while suppressing host interference during extraction. Under CIM, the AFMMs on both sides form a closed-loop collaboration that aligns embedding behavior with extraction objectives. This architecture-level redesign changes how robustness is learned in watermarking systems. Rather than relying on exhaustive distortion simulation, robustness emerges from coordinated representation learning between embedding and extraction. Experiments on real-world and AI-generated datasets demonstrate that the proposed method consistently outperforms state-of-the-art approaches in watermark extraction accuracy while maintaining high perceptual quality, showing strong robustness and generalization.
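The abstract describes AFMM as decoupling modulation structure from modulation strength. The following is a minimal sketch of that general idea only, not the paper's module: the function name `modulate`, the tensor shapes, and the choice of sigmoid and softplus are all assumptions made for illustration. In a learned module the two inputs would presumably be predicted from image content; here they are passed in directly.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softplus(x):
    # Smooth, non-negative mapping; log1p keeps it stable for very negative x.
    return np.log1p(np.exp(x))

def modulate(features, structure_logits, strength_logits):
    """Hypothetical modulation with separate 'where' and 'how much' terms.

    features:         (C, H, W) feature map
    structure_logits: (1, H, W) spatial map deciding WHERE to modulate
    strength_logits:  (C,)      per-channel value deciding HOW MUCH
    """
    structure = sigmoid(structure_logits)      # values in (0, 1), per location
    strength = softplus(strength_logits)       # values >= 0, per channel
    gain = 1.0 + strength[:, None, None] * structure
    return features * gain

# Usage: strong modulation on the left half of the map, none on the right.
rng = np.random.default_rng(0)
feats = rng.normal(size=(4, 8, 8))
where = np.full((1, 8, 8), -50.0)
where[:, :, :4] = 50.0
how_much = np.zeros(4)
out = modulate(feats, where, how_much)
```

Keeping the two terms separate means the spatial pattern can stay fixed while the overall intensity is scaled, or the reverse, which is one plausible reading of "decoupling structure and strength".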

Problem

Research questions and friction points this paper is trying to address.

Embedder and extractor lack explicit collaboration and are coupled only through a final loss

Watermark embedding is not guided toward stable image features, and host content interferes with extraction

Robustness depends on exhaustive distortion simulation rather than coordinated representation learning
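For context, the fixed embedding-distortion-extraction pipeline that the paper sets out to rethink can be illustrated with a toy example. The sketch below is not a deep model and not the paper's method: it swaps in a classical additive spread-spectrum scheme with random ±1 patterns, Gaussian noise as the only simulated distortion, and correlation-based decoding. All names (`embed`, `distort`, `extract`, `run_demo`) and parameter values are assumptions chosen for illustration.

```python
import numpy as np

def embed(image, bits, patterns, alpha=0.05):
    # Map bits {0, 1} to signs {-1, +1} and add a weighted sum of patterns.
    signs = 2.0 * bits - 1.0
    residual = np.tensordot(signs, patterns, axes=1)   # shape (H, W)
    return image + alpha * residual

def distort(image, rng, sigma=0.05):
    # Simulated channel: additive Gaussian noise only.
    return image + rng.normal(0.0, sigma, size=image.shape)

def extract(image, patterns):
    # Correlate the mean-removed image with each pattern; the sign gives the bit.
    centered = image - image.mean()
    scores = np.tensordot(patterns, centered, axes=([1, 2], [0, 1]))
    return (scores > 0).astype(int)

def bit_accuracy(sent, received):
    return float(np.mean(sent == received))

def run_demo(seed=0, n_bits=8, size=64):
    rng = np.random.default_rng(seed)
    image = rng.uniform(0.0, 1.0, size=(size, size))    # stand-in host image
    patterns = rng.choice([-1.0, 1.0], size=(n_bits, size, size))
    bits = rng.integers(0, 2, size=n_bits)
    received = extract(distort(embed(image, bits, patterns), rng), patterns)
    return bit_accuracy(bits, received)

if __name__ == "__main__":
    print("bit accuracy:", run_demo())
```

In this toy setup the embedder and extractor share only the fixed patterns and never exchange information, which mirrors the weak coupling the paper criticizes. The host image itself acts as interference in the correlation step, so the embedding strength `alpha` has to be large relative to it.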

Innovation

Methods, ideas, or system contributions that make the work stand out.

Mutual-teacher training paradigm for embedder-extractor collaboration

Adaptive Feature Modulation Module for content-aware regulation

Closed-loop collaboration aligning embedding with extraction objectives
